Patent search ap:("BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.") AND inv:"Yu SUN"

1.

Invention application
MULTI-MODAL PRE-TRAINING MODEL ACQUISITION METHOD, ELECTRONIC DEVICE AND STORAGE MEDIUM (Granted)

Publication (announcement) No.: US20220019744A1

Publication (announcement) date: 2022-01-20

Application No.: US17319189

Application date: 2021-05-13

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Fei YU, Jiji TANG, Weichong YIN, Yu SUN, Hao TIAN, Hua WU, Haifeng WANG

IPC: G06F40/30, G06N20/00, G06N5/04

Abstract: A multi-modal pre-training model acquisition method, an electronic device and a storage medium, which relate to the fields of deep learning and natural language processing, are disclosed. The method may include: determining, for each image-text pair as training data, to-be-processed fine-grained semantic words in the text; masking the to-be-processed fine-grained semantic words; and training the multi-modal pre-training model using the training data with the fine-grained semantic words masked.
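The masking step in this abstract can be illustrated with a minimal sketch. The abstract does not say how the fine-grained semantic words are determined, so this example assumes a simple lookup in a hypothetical word set; `MASK_TOKEN`, `mask_fine_grained_words` and the sample caption are illustrative names and data, not taken from the patent.

```python
from typing import Dict, List, Set, Tuple

MASK_TOKEN = "[MASK]"  # assumed placeholder symbol, not specified in the abstract


def mask_fine_grained_words(
    tokens: List[str], fine_grained: Set[str]
) -> Tuple[List[str], Dict[int, str]]:
    """Replace every token found in `fine_grained` with MASK_TOKEN.

    Returns the masked token list and a map from masked position to the
    original word, which a model would be trained to predict.
    """
    masked: List[str] = []
    targets: Dict[int, str] = {}
    for position, token in enumerate(tokens):
        if token.lower() in fine_grained:
            masked.append(MASK_TOKEN)
            targets[position] = token
        else:
            masked.append(token)
    return masked, targets


# Hypothetical caption from an image-text pair; the word set stands in for
# whatever procedure selects objects, attributes and relations.
caption = "a black dog chasing a red ball".split()
semantic_words = {"black", "dog", "chasing", "red", "ball"}
masked_caption, targets = mask_fine_grained_words(caption, semantic_words)
```

In a real pipeline the masked caption would be fed to the model together with the paired image, and `targets` would supply the prediction labels.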

2.

Invention application
METHOD AND APPARATUS FOR OBTAINING WORD VECTORS BASED ON LANGUAGE MODEL, DEVICE AND STORAGE MEDIUM (Granted)

Publication (announcement) No.: US20210374343A1

Publication (announcement) date: 2021-12-02

Application No.: US17095955

Application date: 2020-11-12

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Zhen LI, Yukun LI, Yu SUN

IPC: G06F40/279, G06F40/205, G06F16/9032, G06K9/62, G06N20/00

Abstract: A method and apparatus for obtaining word vectors based on a language model, a device and a storage medium are disclosed, which relate to the field of natural language processing technologies in artificial intelligence. An implementation includes inputting each of at least two first sample text language materials into the language model, and outputting a context vector of a first word mask in each first sample text language material via the language model; determining the word vector corresponding to each first word mask based on a first word vector parameter matrix, a second word vector parameter matrix and a fully connected matrix respectively; and training the language model and the fully connected matrix based on the word vectors corresponding to the first word masks in the at least two first sample text language materials, so as to obtain the word vectors.

3.

Invention application
METHOD AND APPARATUS FOR EVALUATING MATCHING DEGREE BASED ON ARTIFICIAL INTELLIGENCE, DEVICE AND STORAGE MEDIUM (Under examination, published)

Publication (announcement) No.: US20180336206A1

Publication (announcement) date: 2018-11-22

Application No.: US15981334

Application date: 2018-05-16

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Shengxian WAN, Yu SUN, Dianhai YU

IPC: G06F17/30, G06F17/27, G06K9/62, G06N5/04

Abstract: The present disclosure provides a method and apparatus for evaluating a matching degree based on artificial intelligence, a device and a storage medium, wherein the method comprises: respectively obtaining word expressions of words in a query and word expressions of words in a title; respectively obtaining context-based word expressions of words in the query and context-based word expressions of words in the title according to the word expressions; generating matching features according to obtained information; determining a matching degree score between the query and the title according to the matching features. The solution of the present disclosure may be applied to improve the accuracy of the evaluation result.

4.

Invention application
MULTI-LINGUAL MODEL TRAINING METHOD, APPARATUS, ELECTRONIC DEVICE AND READABLE STORAGE MEDIUM (Granted)

Publication (announcement) No.: US20220171941A1

Publication (announcement) date: 2022-06-02

Application No.: US17348104

Application date: 2021-06-15

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Xuan OUYANG, Shuohuan WANG, Chao PANG, Yu SUN, Hao TIAN, Hua WU, Haifeng WANG

IPC: G06F40/30, G06F40/58, G06N20/00

Abstract: The present disclosure provides a multi-lingual model training method, apparatus, electronic device and readable storage medium and relates to the technical field of deep learning and natural language processing. A technical solution of the present disclosure when training the multi-lingual model is: obtaining training corpuses comprising a plurality of bilingual corpuses and a plurality of monolingual corpuses; training a multi-lingual model with a first training task by using the plurality of bilingual corpuses; training the multi-lingual model with a second training task by using the plurality of monolingual corpuses; and completing the training of the multi-lingual model in a case of determining that loss functions of the first training task and second training task converge. In the present disclosure, the multi-lingual model can be enabled to achieve semantic interaction between different languages and improve the accuracy of the multi-lingual model in learning the semantic representations of the multi-lingual model.
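The stopping rule in this abstract (training with two tasks and finishing once both loss functions converge) can be sketched with a toy example. Everything below is an assumption made for illustration: the two quadratic losses stand in for the bilingual and monolingual training tasks, a single shared parameter `w` stands in for the model, and "converged" is taken to mean that a loss changes by less than `tol` between rounds.

```python
def train_until_both_converge(lr=0.1, tol=1e-6, max_rounds=10_000):
    """Alternate one gradient step per task until both losses stop changing."""
    w = 0.0  # shared toy parameter standing in for the multi-lingual model
    prev_loss1 = prev_loss2 = float("inf")
    for rounds in range(1, max_rounds + 1):
        # First training task (toy stand-in for the bilingual task).
        w -= lr * 2.0 * (w - 1.0)
        loss1 = (w - 1.0) ** 2
        # Second training task (toy stand-in for the monolingual task).
        w -= lr * 2.0 * (w - 3.0)
        loss2 = (w - 3.0) ** 2
        # Training completes only when BOTH losses have converged.
        if abs(prev_loss1 - loss1) < tol and abs(prev_loss2 - loss2) < tol:
            return w, loss1, loss2, rounds
        prev_loss1, prev_loss2 = loss1, loss2
    raise RuntimeError("losses did not converge within max_rounds")


w, loss1, loss2, rounds = train_until_both_converge()
```

Because the two toy tasks pull `w` toward different targets, the parameter settles at a compromise between them, which is why the check looks at the change in each loss rather than at its absolute value.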

5.

Invention application
METHOD AND APPARATUS FOR GENERATING VECTOR REPRESENTATION OF TEXT, AND RELATED COMPUTER DEVICE (Granted)

Publication (announcement) No.: US20210192141A1

Publication (announcement) date: 2021-06-24

Application No.: US16939947

Application date: 2020-07-27

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Chao PANG, Shuohuan WANG, Yu SUN, Zhi LI

IPC: G06F40/30, G06N20/00

Abstract: A method for generating a vector representation of a text includes dividing the text into text segments. Each text segment is represented as a segment vector corresponding to the respective text segment by employing a first-level semantic model. The segment vector is configured to indicate a semantics of the text segment. Text semantics recognition is performed on the segment vector of each text segment by employing a second-level semantic model to obtain a text vector for indicating a topic of the text.
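The two-level flow described in this abstract (split the text, encode each segment, then combine the segment vectors into one text vector) can be sketched as follows. The encoders here are deliberately trivial stand-ins and are assumptions of this example: `segment_vector` plays the role of the first-level semantic model and `text_vector` the role of the second-level model, which in practice would both be learned networks.

```python
from typing import List


def split_into_segments(text: str, words_per_segment: int) -> List[List[str]]:
    """Divide the text into consecutive segments of at most N words."""
    words = text.split()
    return [
        words[i : i + words_per_segment]
        for i in range(0, len(words), words_per_segment)
    ]


def segment_vector(segment: List[str]) -> List[float]:
    """Toy first-level model: [word count, average word length]."""
    return [float(len(segment)), sum(len(w) for w in segment) / len(segment)]


def text_vector(segment_vectors: List[List[float]]) -> List[float]:
    """Toy second-level model: element-wise mean over the segment vectors."""
    dim = len(segment_vectors[0])
    count = len(segment_vectors)
    return [sum(v[d] for v in segment_vectors) / count for d in range(dim)]


segments = split_into_segments("aa bbbb cc dd ee", words_per_segment=2)
vectors = [segment_vector(s) for s in segments]
doc_vector = text_vector(vectors)
```

The point of the split is that the first-level model only ever sees a short segment, while the second-level model operates on a short sequence of vectors, so long texts stay tractable.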

6.

Invention application
TEXT RECOGNITION METHOD, ELECTRONIC DEVICE, AND STORAGE MEDIUM (Granted)

Publication (announcement) No.: US20210383064A1

Publication (announcement) date: 2021-12-09

Application No.: US17101789

Application date: 2020-11-23

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Shuohuan WANG, Siyu DING, Yu SUN, Hua WU, Haifeng WANG

IPC: G06F40/279, G06F40/166, G06F40/30, G06N20/00

Abstract: The disclosure provides a text recognition method, an electronic device, and a storage medium. The method includes: obtaining N segments of a sample text; inputting each of the N segments into a preset initial language model in sequence, to obtain first text vector information corresponding to the N segments; inputting each of the N segments into the initial language model in sequence again, to obtain second text vector information corresponding to a currently input segment; in response to determining that the currently input segment has a mask, predicting the mask according to the second text vector information and the first text vector information to obtain a predicted word at a target position corresponding to the mask; training the initial language model according to an original word and the predicted word to generate a long text language model; and recognizing an input text through the long text language model.

7.

Invention application
METHOD AND APPARATUS FOR ADVERSARIAL TRAINING OF MACHINE LEARNING MODEL, AND MEDIUM (Granted)

Publication (announcement) No.: US20210334659A1

Publication (announcement) date: 2021-10-28

Application No.: US17369699

Application date: 2021-07-07

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Siyu DING, Shuohuan WANG, Yu SUN

IPC: G06N3/08, G06N3/04

Abstract: The present application discloses a method and an apparatus for adversarial training of a machine learning (ML) model and a medium. The method includes: obtaining input information in a training sample; extracting features of a plurality of input characters in the input information; inputting the features of the plurality of input characters to the ML model, to capture an attention weight on an input character of the plurality of input characters by an attention layer of the ML model; disturbing the attention weight captured by the attention layer, so that the ML model outputs a predicted character according to the attention weight disturbed; and training the ML model according to a difference between the predicted character and a labeled character in the training sample.
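The idea of disturbing an attention weight can be illustrated with a small sketch. The abstract does not specify the form of the disturbance, so this example assumes bounded uniform noise followed by renormalisation; `softmax`, `disturb_attention` and `epsilon` are illustrative names, and a real adversarial scheme might instead derive the perturbation from gradients.

```python
import math
import random
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Turn raw attention scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def disturb_attention(
    weights: List[float], epsilon: float, rng: random.Random
) -> List[float]:
    """Add uniform noise in [-epsilon, epsilon], clip at 0, renormalise."""
    noisy = [max(w + rng.uniform(-epsilon, epsilon), 0.0) for w in weights]
    total = sum(noisy)
    if total == 0.0:  # degenerate case: keep the original weights
        return list(weights)
    return [n / total for n in noisy]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
weights = softmax([2.0, 1.0, 0.5, 0.1])
disturbed = disturb_attention(weights, epsilon=0.05, rng=rng)
```

During training, the model would compute its prediction from `disturbed` rather than `weights`, and the loss against the labeled character would be backpropagated as usual.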

8.

Invention application
METHOD AND APPARATUS FOR TRAINING PRE-TRAINED KNOWLEDGE MODEL, AND ELECTRONIC DEVICE (Granted)

Publication (announcement) No.: US20210248498A1

Publication (announcement) date: 2021-08-12

Application No.: US17241999

Application date: 2021-04-27

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Chao PANG, Shuohuan WANG, Yu SUN, Zhi LI

IPC: G06N5/04, G06F40/30, G06N20/00

Abstract: A method for training a pre-trained knowledge model includes: obtaining a training text, in which the training text includes a structured knowledge text and an article corresponding to the structured knowledge text, and the structured knowledge text includes a head node, a tail node, and a relationship between the head node and the tail node; and training a pre-trained knowledge model to be trained according to the training text.

9.

Invention application
LANGUAGE GENERATION METHOD AND APPARATUS, ELECTRONIC DEVICE AND STORAGE MEDIUM (Granted)

Publication (announcement) No.: US20210232775A1

Publication (announcement) date: 2021-07-29

Application No.: US17031569

Application date: 2020-09-24

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Han ZHANG, Dongling XIAO, Yukun LI, Yu SUN, Hao TIAN, Hua WU, Haifeng WANG

IPC: G06F40/56

Abstract: The present disclosure proposes a language generation method and apparatus. The method includes: performing encoding processing on an input sequence by using a preset encoder to generate a hidden state vector corresponding to the input sequence; in response to a granularity category of a second target segment being a phrase, decoding a first target segment vector, the hidden state vector, and a position vector corresponding to the second target segment by using N decoders to generate N second target segments; determining a loss value based on differences between respective N second target segments and a second target annotated segment; and performing parameter updating on the preset encoder, a preset classifier, and the N decoders based on the loss value to generate an updated language generation model for performing language generation.

10.

Invention application
METHOD AND APPARATUS FOR PARSING QUERY BASED ON ARTIFICIAL INTELLIGENCE, AND STORAGE MEDIUM (Under examination, published)

Publication (announcement) No.: US20180341698A1

Publication (announcement) date: 2018-11-29

Application No.: US15990157

Application date: 2018-05-25

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Shuohuan WANG, Yu SUN, Dianhai YU

IPC: G06F17/30, G06F17/27, G06N5/02, G06N99/00

Abstract: The present disclosure provides a method and apparatus for parsing a query based on artificial intelligence, and a storage medium, wherein the method comprises: regarding any application domain, obtaining a knowledge library corresponding to the application domain; determining a training query serving as a training language material according to the knowledge library; obtaining a deep query parsing model by training according to the training language material; using the deep query parsing model to parse the user's query to obtain a parsing result. The solution of the present disclosure may be applied to improve the accuracy of the parsing result.
