-
Publication No.: US20250139359A1
Publication Date: 2025-05-01
Application No.: US18943845
Filing Date: 2024-11-11
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Yichun Yin, Lifeng Shang, Xin Jiang, Xiao Chen
IPC: G06F40/216, G06F40/166, G06F40/279, G06F40/30, G06N3/045, G06N3/084
Abstract: A text processing model training method, and a text processing method and apparatus in the natural language processing field in the artificial intelligence field are disclosed. The training method includes: obtaining training text; separately inputting the training text into a teacher model and a student model to obtain sample data output by the teacher model and prediction data output by the student model; the sample data includes a sample semantic feature and a sample label; the prediction data includes a prediction semantic feature and a prediction label; and the teacher model is a pre-trained language model used for text classification; and training a model parameter of the student model based on the sample data and the prediction data, to obtain a target student model. The method enables the student model to effectively perform knowledge transfer, thereby improving accuracy of a text processing result of the student model.
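The training objective described in this abstract can be illustrated with a minimal sketch. The abstract does not state how the semantic features and labels are compared, so everything below is an assumption made for illustration: the mean-squared-error feature term, the soft-label cross-entropy term, the weighting factor alpha, the temperature, and all function and dictionary key names are hypothetical and are not taken from the patent.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities (numerically stabilised)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def feature_loss(teacher_feature, student_feature):
    """Assumed feature term: mean squared error between equal-length vectors."""
    n = len(teacher_feature)
    return sum((t - s) ** 2 for t, s in zip(teacher_feature, student_feature)) / n


def label_loss(teacher_logits, student_logits, temperature=1.0):
    """Assumed label term: cross-entropy of the student against teacher soft labels."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))


def distillation_loss(sample, prediction, alpha=0.5, temperature=1.0):
    """Weighted sum of the feature term and the label term.

    `sample` stands for the teacher output and `prediction` for the student
    output; both are dicts with hypothetical keys "feature" and "logits".
    """
    f = feature_loss(sample["feature"], prediction["feature"])
    l = label_loss(sample["logits"], prediction["logits"], temperature)
    return alpha * f + (1.0 - alpha) * l


# Hypothetical teacher output and two candidate student outputs.
teacher = {"feature": [0.2, -0.1, 0.7], "logits": [2.0, 0.5, -1.0]}
close_student = {"feature": [0.2, -0.1, 0.7], "logits": [2.0, 0.5, -1.0]}
far_student = {"feature": [0.9, 0.4, -0.3], "logits": [-1.0, 0.5, 2.0]}

loss_close = distillation_loss(teacher, close_student)
loss_far = distillation_loss(teacher, far_student)
```

In this sketch a student whose outputs match the teacher gets a lower loss than one whose outputs differ; updating the student's parameters to reduce such a loss is one common way to realise the knowledge transfer the abstract refers to.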
-
Publication No.: US20220180202A1
Publication Date: 2022-06-09
Application No.: US17682145
Filing Date: 2022-02-28
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Yichun Yin, Lifeng Shang, Xin Jiang, Xiao Chen
IPC: G06N3/08, G06N3/04, G06F40/30, G06F40/166, G06F40/279
Abstract: A text processing model training method, and a text processing method and apparatus in the natural language processing field in the artificial intelligence field are disclosed. The training method includes: obtaining training text; separately inputting the training text into a teacher model and a student model to obtain sample data output by the teacher model and prediction data output by the student model; the sample data includes a sample semantic feature and a sample label; the prediction data includes a prediction semantic feature and a prediction label; and the teacher model is a pre-trained language model used for text classification; and training a model parameter of the student model based on the sample data and the prediction data, to obtain a target student model. The method enables the student model to effectively perform knowledge transfer, thereby improving accuracy of a text processing result of the student model.
-
Publication No.: US12182507B2
Publication Date: 2024-12-31
Application No.: US17682145
Filing Date: 2022-02-28
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Yichun Yin, Lifeng Shang, Xin Jiang, Xiao Chen
IPC: G06F40/216, G06F40/166, G06F40/279, G06F40/30, G06N3/045, G06N3/084
Abstract: A text processing model training method, and a text processing method and apparatus in the natural language processing field in the artificial intelligence field are disclosed. The training method includes: obtaining training text; separately inputting the training text into a teacher model and a student model to obtain sample data output by the teacher model and prediction data output by the student model; the sample data includes a sample semantic feature and a sample label; the prediction data includes a prediction semantic feature and a prediction label; and the teacher model is a pre-trained language model used for text classification; and training a model parameter of the student model based on the sample data and the prediction data, to obtain a target student model. The method enables the student model to effectively perform knowledge transfer, thereby improving accuracy of a text processing result of the student model.
-
Publication No.: US20240127000A1
Publication Date: 2024-04-18
Application No.: US17958080
Filing Date: 2022-09-30
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Yichun Yin, Lifeng Shang, Cheng Chen, Xin Jiang, Xiao Chen, Qun Liu
Abstract: A computer-implemented method is provided for model training performed by a processing system. The method comprises determining a set of first weights based on a first matrix associated with a source model, determining a set of second weights based on the set of first weights, forming a second matrix associated with a target model based on the set of first weights and the set of second weights, initializing the target model based on the second matrix, and training the target model.
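The initialization steps in this abstract can be sketched as follows. The abstract does not say how the second weights are derived from the first, so the rule used here (reusing existing columns and rows cyclically to fill the larger matrix) is purely an assumption for illustration, and the function name expand_matrix and its parameters are hypothetical.

```python
def expand_matrix(source, target_rows, target_cols):
    """Build a larger matrix for a target model from a source-model matrix.

    Assumed rule, not taken from the patent: the extra entries are copies of
    existing columns and rows, chosen cyclically.
    """
    rows, cols = len(source), len(source[0])
    if target_rows < rows or target_cols < cols:
        raise ValueError("target shape must be at least the source shape")

    # First weights: taken directly from the source matrix.
    first = [row[:] for row in source]

    # Second weights (columns): derived from the first weights by reuse.
    widened = [
        row + [row[j % cols] for j in range(target_cols - cols)]
        for row in first
    ]

    # Second weights (rows): derived from the already widened rows by reuse.
    extra_rows = [widened[i % rows][:] for i in range(target_rows - rows)]

    # Second matrix: first and second weights combined.
    return widened + extra_rows


source_matrix = [[1, 2], [3, 4]]
target_matrix = expand_matrix(source_matrix, 3, 4)
# target_matrix == [[1, 2, 1, 2], [3, 4, 3, 4], [1, 2, 1, 2]]
```

The top-left block of the result equals the source matrix, so the target model starts from the source model's learned weights before further training.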