-
Publication No.: US20250124350A1
Publication Date: 2025-04-17
Application No.: US18883980
Filing Date: 2024-09-12
Inventor: Yukun MA, Yingtong BU
IPC: G06N20/00, G06F16/2457
Abstract: A method for training a data annotation model, a data annotation method, and corresponding apparatuses are provided. The method includes: acquiring sample data that satisfies preset annotation conditions, where the preset annotation conditions include individual annotation conditions derived from breakdown of an annotation rule corresponding to an annotation task, and the sample data corresponds to chain-of-thought information, which is used to indicate whether the sample data belongs to an annotation category corresponding to the annotation task, and a corresponding reason; determining a sample question corresponding to the sample data based on the annotation category and the sample data, and determining a sample answer corresponding to the sample data based on the chain-of-thought information; and performing model training based on the sample question and the sample answer to obtain a target data annotation model used to obtain an answer to the question.
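The step of deriving a sample question from the annotation category and a sample answer from the chain-of-thought information could be sketched roughly as follows. This is only an illustration under stated assumptions: the `AnnotatedSample` class, the `build_training_pair` function, the prompt wording, and the example text are all hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass
class AnnotatedSample:
    """Hypothetical container for one sample and its chain-of-thought information."""
    text: str      # the sample data
    belongs: bool  # whether the sample belongs to the annotation category
    reason: str    # the reason recorded alongside that verdict


def build_training_pair(sample: AnnotatedSample, category: str) -> dict:
    """Build one (question, answer) pair for supervised fine-tuning."""
    # Question: derived from the annotation category and the sample data.
    question = (
        f'Does the following text belong to the category "{category}"?\n'
        f"Text: {sample.text}"
    )
    # Answer: derived from the chain-of-thought information (verdict plus reason).
    verdict = "Yes" if sample.belongs else "No"
    answer = f"{verdict}. Reason: {sample.reason}"
    return {"question": question, "answer": answer}


pair = build_training_pair(
    AnnotatedSample(
        text="Buy cheap watches now!",
        belongs=True,
        reason="It urges a purchase and uses promotional language.",
    ),
    category="advertisement",
)
```

Pairs of this shape would then be fed to whatever fine-tuning procedure is used; the abstract does not specify one.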
-
Publication No.: US20250068935A1
Publication Date: 2025-02-27
Application No.: US18814316
Filing Date: 2024-08-23
Inventor: Yukun MA
Abstract: Methods, apparatuses, devices, and media for model processing are provided. In an approach, a first knowledge domain in the field of general knowledge involved in a processing model is determined. A first set of reference queries associated with the first knowledge domain is generated. A first set of reference answers matching the first set of reference queries is obtained from a repository that is associated with the processing model and at least relates to the first knowledge domain. The processing model is updated by using the first set of reference queries and the first set of reference answers. With the implementations of the disclosure, the knowledge reserve in various subdivided knowledge domains can be continuously added to the processing model. The processing model can improve performance and accuracy in this way, providing more accurate query result output.
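As a rough sketch of how reference queries for one knowledge domain might be paired with reference answers from a repository, consider the following. The query wording, the dictionary-based repository, and the `build_update_set` function are assumptions made for illustration; the abstract does not say how queries are generated or how the repository is accessed.

```python
def build_update_set(domain: str, topics: list[str], repository: dict[str, str]) -> list[dict]:
    """Pair generated reference queries with matching reference answers.

    `repository` is a hypothetical in-memory stand-in for the repository
    associated with the processing model.
    """
    pairs = []
    for topic in topics:
        # Generate a reference query associated with the knowledge domain.
        query = f"In {domain}, what is {topic}?"
        # Look up the matching reference answer; skip topics the repository lacks.
        answer = repository.get(topic)
        if answer is not None:
            pairs.append({"query": query, "answer": answer})
    return pairs


repository = {"photosynthesis": "The process by which plants convert light into chemical energy."}
update_set = build_update_set("biology", ["photosynthesis", "mitosis"], repository)
```

The resulting query/answer pairs would serve as the data used to update the processing model; the update procedure itself is outside this sketch.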
-
Publication No.: US20250061282A1
Publication Date: 2025-02-20
Application No.: US18784900
Filing Date: 2024-07-25
Inventor: Yukun MA
IPC: G06F40/30
Abstract: A method of generating training data, a readable medium and an electronic device are provided. The method includes: acquiring sample data; determining a data generation template according to the sample data; generating question data according to the first data in the sample data and the data generation template, and determining answer data according to the second data other than the first data in the sample data; and combining the question data and the answer data into the training data.
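One way to picture the template-based generation described above is the sketch below. It assumes that a sample is a dictionary of fields, that the template is chosen by the set of field names, and that one field is held out as the answer. The `TEMPLATES` table and `make_training_example` function are hypothetical.

```python
# Hypothetical templates, keyed by the set of fields a sample provides.
# Each entry holds a question template and the name of the field used as the answer.
TEMPLATES = {
    frozenset({"title", "body"}): ("Write an article with the title: {title}", "body"),
    frozenset({"term", "definition"}): ("Define the term: {term}", "definition"),
}


def make_training_example(sample: dict) -> dict:
    """Combine question data and answer data from one sample into training data."""
    # Determine the data generation template according to the sample data.
    template, answer_field = TEMPLATES[frozenset(sample)]
    # First data: every field except the answer field; it fills the template.
    first_data = {k: v for k, v in sample.items() if k != answer_field}
    question = template.format(**first_data)
    # Second data: the remaining field, used as the answer.
    return {"question": question, "answer": sample[answer_field]}


example = make_training_example({"term": "entropy", "definition": "A measure of disorder."})
```

A sample whose fields match no template raises `KeyError` here; a real pipeline would need its own policy for that case.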
-
Publication No.: US20240412115A1
Publication Date: 2024-12-12
Application No.: US18814332
Filing Date: 2024-08-23
Inventor: Yukun MA, Yingtong BU
IPC: G06N20/00
Abstract: A method, apparatus, device and medium for verifying output data of a processing model are provided. In the method, in response to receiving input data for the processing model, output data corresponding to the input data is determined from the processing model. A verification result associated with the input data and the output data is acquired from a verification model corresponding to the processing model, the verification result indicating a matching degree between the output data and the input data. A final result corresponding to the input data is provided based on the verification result and the output data.
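The verification flow above can be sketched with two callables standing in for the processing model and the verification model. Everything concrete here is an assumption for illustration: the `answer_with_verification` function, the threshold, the fallback message, and the toy stub models are not taken from the patent.

```python
from typing import Callable


def answer_with_verification(
    input_data: str,
    processing_model: Callable[[str], str],
    verification_model: Callable[[str, str], float],
    threshold: float = 0.5,
    fallback: str = "No reliable answer available.",
) -> str:
    """Return the model output only if the verification score is high enough."""
    # Determine output data for the input from the processing model.
    output = processing_model(input_data)
    # Acquire a matching degree between the output and the input.
    score = verification_model(input_data, output)
    # Provide the final result based on the verification result and the output.
    return output if score >= threshold else fallback


# Toy stand-ins so the sketch runs end to end (hypothetical behavior).
def toy_processing_model(question: str) -> str:
    return "Paris" if "France" in question else "unknown"


def toy_verification_model(question: str, output: str) -> float:
    return 0.0 if output == "unknown" else 1.0


accepted = answer_with_verification("Capital of France?", toy_processing_model, toy_verification_model)
rejected = answer_with_verification("Capital of Mars?", toy_processing_model, toy_verification_model)
```

Replacing the output with a fixed fallback is only one possible policy; the abstract says only that the final result is based on both the verification result and the output.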
-
Publication No.: US20240412049A1
Publication Date: 2024-12-12
Application No.: US18814317
Filing Date: 2024-08-23
Applicant: Beijing Youzhuju Network Technology Co., Ltd., Lemon Inc.
Inventor: Yukun MA, Arinze OFFOR
IPC: G06N3/0475
Abstract: In embodiments of the present disclosure, a solution for model evaluation is provided. The method comprises: providing a plurality of inputs in an input set to a first generative model to obtain a first output set output by the first generative model, wherein the first output set comprises a plurality of outputs corresponding to the plurality of inputs; obtaining first labelling information corresponding to the plurality of outputs in the first output set, the first labelling information indicating, for each output and for each of a plurality of quality evaluation dimensions, the quality level assigned to the output from among a plurality of quality levels defined for that dimension; and determining a first overall quality score of the first generative model at least based on the first labelling information of the outputs and the respective quality scores corresponding to the plurality of quality levels defined for the plurality of quality evaluation dimensions.
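A minimal sketch of the scoring step follows. It assumes that each output is labelled with one quality level per dimension and that the overall score is a plain average, first across dimensions and then across outputs. The abstract does not state the aggregation rule, so the averaging, the dimension names, the level names, and the scores are all illustrative assumptions.

```python
# Hypothetical quality scores for the quality levels of each evaluation dimension.
LEVEL_SCORES = {
    "accuracy": {"good": 1.0, "fair": 0.5, "poor": 0.0},
    "fluency": {"good": 1.0, "poor": 0.0},
}


def overall_quality_score(labels: list[dict], level_scores: dict) -> float:
    """Average per-output scores, each being the mean score across dimensions."""
    per_output = []
    for label in labels:
        # Map each labelled quality level to its score in that dimension.
        scores = [level_scores[dim][level] for dim, level in label.items()]
        per_output.append(sum(scores) / len(scores))
    return sum(per_output) / len(per_output)


# Labelling information for two outputs of the model under evaluation.
labels = [
    {"accuracy": "good", "fluency": "good"},  # per-output score 1.0
    {"accuracy": "fair", "fluency": "poor"},  # per-output score 0.25
]
score = overall_quality_score(labels, LEVEL_SCORES)  # (1.0 + 0.25) / 2 = 0.625
```

Weighting some dimensions more heavily than others would be a natural variation, but nothing in the abstract confirms or rules it out.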