Patent search ap:("Beijing Baidu Netcom Science AND Technology Co. Page Ltd.") AND inv:"Shuohuan Wang"

1.

发明授权
Method and apparatus for training semantic representation model, device and computer storage medium 有权

公开(公告)号：US11914964B2

公开(公告)日：2024-02-27

申请号：US17209124

申请日：2021-03-22

Applicant: Beijing Baidu Netcom Science and Technology Co., Ltd.

Inventor： Shuohuan Wang , Jiaxiang Liu , Xuan Ouyang , Yu Sun , Hua Wu , Haifeng Wang

IPC: G06F40/30 , G06N20/00 , G06N5/04

CPC classification number: G06F40/30 , G06N5/04 , G06N20/00

Abstract: The present application discloses a method and apparatus for training a semantic representation model, a device and a computer storage medium, which relates to the field of natural language processing technologies in artificial intelligence. An implementation includes: acquiring a semantic representation model which has been trained for a first language as a first semantic representation model; taking a bottom layer and a top layer of the first semantic representation model as trained layers, initializing the trained layers, keeping model parameters of other layers unchanged, and training the trained layers using training language materials of a second language until a training ending condition is met; successively bringing the untrained layers into the trained layers from bottom to top, and executing these layers respectively: keeping the model parameters of other layers than the trained layers unchanged, and training the trained layers using the training language materials of the second language until the training ending condition is met respectively; and obtaining a semantic representation model for the second language after all the layers are trained.

2.

发明申请
METHOD AND APPARATUS FOR TRAINING SEMANTIC REPRESENTATION MODEL, DEVICE AND COMPUTER STORAGE MEDIUM 有权

公开(公告)号：US20220004716A1

公开(公告)日：2022-01-06

申请号：US17209124

申请日：2021-03-22

Applicant: Beijing Baidu Netcom Science and Technology Co., Ltd.

Inventor： Shuohuan Wang , Jiaxiang Liu , Xuan Ouyang , Yu Sun , Hua Wu , Haifeng Wang

IPC: G06F40/30 , G06N20/00 , G06N5/04

Abstract: The present application discloses a method and apparatus for training a semantic representation model, a device and a computer storage medium, which relates to the field of natural language processing technologies in artificial intelligence. An implementation includes: acquiring a semantic representation model which has been trained for a first language as a first semantic representation model; taking a bottom layer and a top layer of the first semantic representation model as trained layers, initializing the trained layers, keeping model parameters of other layers unchanged, and training the trained layers using training language materials of a second language until a training ending condition is met; successively bringing the untrained layers into the trained layers from bottom to top, and executing these layers respectively: keeping the model parameters of other layers than the trained layers unchanged, and training the trained layers using the training language materials of the second language until the training ending condition is met respectively; and obtaining a semantic representation model for the second language after all the layers are trained.

3.

发明授权
Multi-lingual model training method, apparatus, electronic device and readable storage medium 有权

公开(公告)号：US11995405B2

公开(公告)日：2024-05-28

申请号：US17348104

申请日：2021-06-15

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor： Xuan Ouyang , Shuohuan Wang , Chao Pang , Yu Sun , Hao Tian , Hua Wu , Haifeng Wang

IPC: G06F40/30 , G06F40/58 , G06N20/00

CPC classification number: G06F40/30 , G06F40/58 , G06N20/00

Abstract: The present disclosure provides a multi-lingual model training method, apparatus, electronic device and readable storage medium and relates to the technical field of deep learning and natural language processing. A technical solution of the present disclosure when training the multi-lingual model is: obtaining training corpuses comprising a plurality of bilingual corpuses and a plurality of monolingual corpuses; training a multi-lingual model with a first training task by using the plurality of bilingual corpuses; training the multi-lingual model with a second training task by using the plurality of monolingual corpuses; and completing the training of the multi-lingual model in a case of determining that loss functions of the first training task and second training task converge. In the present disclosure, the multi-lingual model can be enabled to achieve semantic interaction between different languages and improve the accuracy of the multi-lingual model in learning the semantic representations of the multi-lingual model.

4.

发明申请
METHOD AND APPARATUS FOR TRAINING NATURAL LANGUAGE PROCESSING MODEL, DEVICE AND STORAGE MEDIUM 有权

公开(公告)号：US20220019736A1

公开(公告)日：2022-01-20

申请号：US17211669

申请日：2021-03-24

Applicant: Beijing Baidu Netcom Science and Technology Co., Ltd.

Inventor： Xuan Ouyang , Shuohuan Wang , Yu Sun

IPC: G06F40/253 , G06F40/166

Abstract: The present application discloses a method and apparatus for training a natural language processing model, a device and a storage medium, which relates to the natural language processing field based on artificial intelligence. An implementation includes: constructing training language material pairs of a coreference resolution task based on a preset language material set, wherein each training language material pair includes a positive sample and a negative sample; training the natural language processing model with the training language material pair to enable the natural language processing model to learn the capability of recognizing corresponding positive samples and negative samples; and training the natural language processing model with the positive samples of the training language material pairs to enable the natural language processing model to learn the capability of the coreference resolution task.

5.

发明授权
Method, apparatus, device, and storage medium for learning knowledge representation 有权

公开(公告)号：US11687718B2

公开(公告)日：2023-06-27

申请号：US17116846

申请日：2020-12-09

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor： Chao Pang , Shuohuan Wang , Yu Sun , Hua Wu , Haifeng Wang

IPC: G06F17/00 , G06F40/295 , G06F40/137 , G06F40/30

CPC classification number: G06F40/295 , G06F40/137 , G06F40/30

Abstract: A method, an apparatus, a device and a storage medium for learning a knowledge representation are provided. The method can include: sampling a sub-graph of a knowledge graph from a knowledge base; serializing the sub-graph of the knowledge graph to obtain a serialized text; and reading using a pre-trained language model the serialized text in an order in the sub-graph of the knowledge graph, to perform learning to obtain a knowledge representation of each word in the serialized text. The knowledge representation learning in this embodiment is performed for entity and relationship representation learning in the knowledge base.

6.

发明授权
Method, apparatus, electronic device and storage medium for processing a semantic representation model 有权

公开(公告)号：US11520991B2

公开(公告)日：2022-12-06

申请号：US16885358

申请日：2020-05-28

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor： Yu Sun , Haifeng Wang , Shuohuan Wang , Yukun Li , Shikun Feng , Hao Tian , Hua Wu

IPC: G06F40/30 , G06F40/40

Abstract: The present disclosure provides a method, apparatus, electronic device and storage medium for processing a semantic representation model, and relates to the field of artificial intelligence technologies. A specific implementation solution is: collecting a training corpus set including a plurality of training corpuses; training the semantic representation model using the training corpus set based on at least one of lexicon, grammar and semantics. In the present disclosure, by building the unsupervised or weakly-supervised training task at three different levels, namely, lexicon, grammar and semantics, the semantic representation model is enabled to learn knowledge at levels of lexicon, grammar and semantics from massive data, enhance the capability of universal semantic representation and improve the processing effect of the NLP task.

7.

发明授权
Method and apparatus for parsing query based on artificial intelligence, and storage medium 有权

公开(公告)号：US10698932B2

公开(公告)日：2020-06-30

申请号：US15990157

申请日：2018-05-25

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor： Shuohuan Wang , Yu Sun , Dianhai Yu

IPC: G06F16/332 , G06N20/00 , G06F40/35 , G06F40/205 , G06F40/216 , G06F40/242 , G06N3/04 , G06N3/08 , G06N5/02 , G06N7/00

Abstract: The present disclosure provides a method and apparatus for parsing a query based on artificial intelligence, and a storage medium, wherein the method comprises: regarding any application domain, obtaining a knowledge library corresponding to the application domain; determining a training query serving as a training language material according to the knowledge library; obtaining a deep query parsing model by training according to the training language material; using the deep query parsing model to parse the user's query to obtain a parsing result. The solution of the present disclosure may be applied to improve the accuracy of the parsing result.

8.

发明授权
Text recognition method, electronic device, and storage medium 有权

公开(公告)号：US11663404B2

公开(公告)日：2023-05-30

申请号：US17101789

申请日：2020-11-23

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor： Shuohuan Wang , Siyu Ding , Yu Sun , Hua Wu , Haifeng Wang

IPC: G06F40/279 , G06N20/00 , G06F40/166 , G06F40/30

CPC classification number: G06F40/279 , G06F40/166 , G06F40/30 , G06N20/00

Abstract: The disclosure provides a text recognition method, an electronic device, and a storage medium. The method includes: obtaining N segments of a sample text; inputting each of the N segments into a preset initial language model in sequence, to obtain first text vector information corresponding to the N segments; inputting each of the N segments into the initial language model in sequence again, to obtain second text vector information corresponding to a currently input segment; in response to determining that the currently input segment has the mask, predicting the mask according to the second text vector information and the first text vector information to obtain a predicted word at a target position corresponding to the mask; training the initial language model according to an original word and the predicted word to generate a long text language model; and recognizing an input text through the long text language model.

9.

发明授权
Method and apparatus for generating semantic representation model, and storage medium 有权

公开(公告)号：US12106052B2

公开(公告)日：2024-10-01

申请号：US17205894

申请日：2021-03-18

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor： Shuohuan Wang , Siyu Ding , Yu Sun

IPC: G06N5/02 , G06F18/21 , G06F40/279 , G06F40/30

CPC classification number: G06F40/30 , G06F18/2163 , G06F40/279 , G06N5/02

Abstract: The disclosure discloses a method and an apparatus for generating a semantic representation model, and a storage medium. The detailed implementation includes: performing recognition and segmentation on the original text included in an original text set to obtain knowledge units and non-knowledge units in the original text; performing knowledge unit-level disorder processing on the knowledge units and the non-knowledge units in the original text to obtain a disorder text; generating a training text set based on the character attribute of each character in the disorder text; and training an initial semantic representation model by employing the training text set to generate the semantic representation model.

10.

发明授权
Method and apparatus for processing information 有权

公开(公告)号：US11232140B2

公开(公告)日：2022-01-25

申请号：US16054920

申请日：2018-08-03

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor： Shuohuan Wang , Yu Sun , Dianhai Yu

IPC: G06F16/33 , G10L15/18 , G06F16/31 , G06F16/951 , G06F40/30 , G06F40/211

Abstract: Embodiments of the present disclosure disclose a method and apparatus for processing information. A specific implementation of the method includes: acquiring a search result set related to a search statement inputted by a user; parsing the search statement to generate a first syntax tree, and parsing a search result in the search result set to generate a second syntax tree set; calculating a similarity between the search statement and the search result in the search result set using a pre-trained semantic matching model on the basis of the first syntax tree and the second syntax tree set, the semantic matching model being used to determine the similarity between the syntax trees; and sorting the search result in the search result set on the basis of the similarity between the search statement and the search result in the search result set, and pushing the sorted search result set to the user.

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification