MULTI-LINGUAL MODEL TRAINING METHOD, APPARATUS, ELECTRONIC DEVICE AND READABLE STORAGE MEDIUM

    Publication (Announcement) Number: US20220171941A1

    Publication (Announcement) Date: 2022-06-02

    Application Number: US17348104

    Application Date: 2021-06-15

    Abstract: The present disclosure provides a multi-lingual model training method, apparatus, electronic device and readable storage medium, relating to the technical fields of deep learning and natural language processing. The technical solution for training the multi-lingual model is: obtaining training corpora comprising a plurality of bilingual corpora and a plurality of monolingual corpora; training a multi-lingual model with a first training task using the plurality of bilingual corpora; training the multi-lingual model with a second training task using the plurality of monolingual corpora; and completing the training when the loss functions of the first training task and the second training task are determined to converge. The multi-lingual model is thereby enabled to achieve semantic interaction between different languages, improving the accuracy of the semantic representations it learns.
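The two-task schedule described in the abstract can be sketched as a loop that alternates the bilingual and monolingual objectives and stops when both losses converge. This is a minimal illustration only: the step functions, the decaying losses, and the convergence tolerance are assumptions standing in for the patent's unspecified training details.

```python
def train_step_bilingual(corpus, epoch):
    # Stand-in for the first training task on bilingual corpora; a real
    # implementation would run forward/backward passes. Here the loss
    # simply decays so the convergence check below is exercised.
    return 1.0 / (epoch + 1)

def train_step_monolingual(corpus, epoch):
    # Stand-in for the second training task on monolingual corpora.
    return 1.0 / (epoch + 1)

def train_multilingual_model(bilingual_corpus, monolingual_corpus,
                             tol=1e-3, max_epochs=1000):
    prev1 = prev2 = float("inf")
    loss1 = loss2 = float("inf")
    for epoch in range(max_epochs):
        # First training task: objective over the bilingual corpora.
        loss1 = train_step_bilingual(bilingual_corpus, epoch)
        # Second training task: objective over the monolingual corpora.
        loss2 = train_step_monolingual(monolingual_corpus, epoch)
        # Training completes when both loss functions have converged.
        if abs(prev1 - loss1) < tol and abs(prev2 - loss2) < tol:
            return epoch, loss1, loss2
        prev1, prev2 = loss1, loss2
    return max_epochs, loss1, loss2
```

With the toy decaying losses, the loop terminates well before `max_epochs` once consecutive losses differ by less than `tol` for both tasks.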

    METHOD FOR TRAINING MULTILINGUAL SEMANTIC REPRESENTATION MODEL, DEVICE AND STORAGE MEDIUM

    Publication (Announcement) Number: US20220019743A1

    Publication (Announcement) Date: 2022-01-20

    Application Number: US17318577

    Application Date: 2021-05-12

    Abstract: Technical solutions relate to the field of natural language processing based on artificial intelligence. According to an embodiment, a multilingual semantic representation model is trained using a plurality of training language materials represented in a plurality of languages respectively, such that the model learns the semantic representation capability of each language; a corresponding mixed-language language material, comprising language materials in at least two languages, is generated for each of the plurality of training language materials; and the model is trained using each mixed-language language material together with its corresponding training language material, such that the model learns semantic alignment information across different languages.
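One common way to produce a mixed-language material of the kind the abstract describes is dictionary-based code-switching: replacing a fraction of a sentence's tokens with their translations in another language. The abstract does not specify the generation mechanism, so the function and its bilingual dictionary below are illustrative assumptions only.

```python
import random

def make_mixed_language(tokens, bilingual_dict, replace_ratio=0.3):
    # Generate a mixed-language material from a monolingual token list by
    # swapping a fraction of tokens for their translations, so the result
    # contains material in at least two languages. This code-switching
    # strategy is an assumption, not the patent's stated method.
    mixed = []
    for tok in tokens:
        if tok in bilingual_dict and random.random() < replace_ratio:
            mixed.append(bilingual_dict[tok])  # translated token
        else:
            mixed.append(tok)                  # original-language token
    return mixed
```

Pairing each generated mixed-language sample with its original sentence then gives the model aligned views of the same meaning in different languages.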
