METHOD FOR TRAINING DATA ANNOTATION MODEL, DATA ANNOTATION METHOD, AND CORRESPONDING APPARATUSES

    Publication No.: US20250124350A1

    Publication Date: 2025-04-17

    Application No.: US18883980

    Filing Date: 2024-09-12

    Abstract: A method for training a data annotation model, a data annotation method, and corresponding apparatuses are provided. The training method includes: acquiring sample data that satisfies preset annotation conditions, where the preset annotation conditions include individual annotation conditions obtained by breaking down an annotation rule corresponding to an annotation task, and the sample data corresponds to chain-of-thought information indicating whether the sample data belongs to an annotation category corresponding to the annotation task, together with the corresponding reason; determining a sample question corresponding to the sample data based on the annotation category and the sample data, and determining a sample answer corresponding to the sample data based on the chain-of-thought information; and performing model training based on the sample question and the sample answer to obtain a target data annotation model used to obtain answers to such questions.
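
    The construction of one training pair described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the names `ChainOfThought`, `satisfies_conditions`, and `build_training_pair`, the prompt wording, and the representation of annotation conditions as simple predicates are all hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ChainOfThought:
    """Hypothetical container for the chain-of-thought information."""
    belongs: bool  # whether the sample belongs to the annotation category
    reason: str    # the corresponding reason


def satisfies_conditions(sample: str, conditions: List[Callable[[str], bool]]) -> bool:
    # Assumption: each individual annotation condition is a predicate on the sample.
    return all(condition(sample) for condition in conditions)


def build_training_pair(sample: str, category: str, cot: ChainOfThought) -> Dict[str, str]:
    # The question is built from the annotation category and the sample data.
    question = (
        f'Does the following text belong to the category "{category}"?\n'
        f"Text: {sample}"
    )
    # The answer is built from the chain-of-thought information.
    verdict = "Yes" if cot.belongs else "No"
    answer = f"{cot.reason} Therefore: {verdict}."
    return {"question": question, "answer": answer}


if __name__ == "__main__":
    conditions = [lambda s: len(s) > 0, lambda s: "!" in s]
    sample = "Buy now!!!"
    if satisfies_conditions(sample, conditions):
        cot = ChainOfThought(True, "The text is an unsolicited promotion.")
        print(build_training_pair(sample, "spam", cot))
```

    Pairs produced this way would then serve as supervised fine-tuning data; the training step itself is not shown.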

    MODEL PROCESSING METHOD, APPARATUS, DEVICE AND MEDIUM

    Publication No.: US20250068935A1

    Publication Date: 2025-02-27

    Application No.: US18814316

    Filing Date: 2024-08-23

    Inventor: Yukun MA

    Abstract: Methods, apparatuses, devices, and media for model processing are provided. In one approach, a first knowledge domain in the field of general knowledge involved in a processing model is determined. A first set of reference queries associated with the first knowledge domain is generated. A first set of reference answers matching the first set of reference queries is obtained from a repository that is associated with the processing model and at least relates to the first knowledge domain. The processing model is updated by using the first set of reference queries and the first set of reference answers. With the implementations of the disclosure, knowledge of various subdivided knowledge domains can be continuously added to the processing model, which improves its performance and accuracy and yields more accurate query results.

    METHOD OF GENERATING TRAINING DATA, READABLE MEDIUM, AND ELECTRONIC DEVICE

    Publication No.: US20250061282A1

    Publication Date: 2025-02-20

    Application No.: US18784900

    Filing Date: 2024-07-25

    Inventor: Yukun MA

    Abstract: A method of generating training data, a readable medium and an electronic device are provided. The method includes: acquiring sample data; determining a data generation template according to the sample data; generating question data according to first data in the sample data and the data generation template, and determining answer data according to second data in the sample data other than the first data; and combining the question data and the answer data into the training data.
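
    A minimal sketch of this template-based generation follows. It assumes that a sample is a dictionary of fields, that the template is chosen by which fields are present, and that one field is designated as the answer; the template texts and the names `TEMPLATES`, `select_template`, and `make_training_example` are hypothetical.

```python
from typing import Dict

# Hypothetical data generation templates; the abstract does not specify their form.
TEMPLATES = {
    "summary": "Summarize the following text: {text}",
    "qa": "Answer the question: {question}",
}


def select_template(sample: Dict[str, str]) -> str:
    # Assumption: the template is determined by which fields the sample contains.
    return TEMPLATES["summary"] if "text" in sample else TEMPLATES["qa"]


def make_training_example(sample: Dict[str, str], answer_field: str) -> Dict[str, str]:
    template = select_template(sample)
    # "First data": every field except the one reserved for the answer.
    first_data = {k: v for k, v in sample.items() if k != answer_field}
    # "Second data": the remaining field, used as the answer.
    answer = sample[answer_field]
    question = template.format(**first_data)
    return {"question": question, "answer": answer}


if __name__ == "__main__":
    sample = {"text": "Cats sleep most of the day.", "summary": "Cats sleep a lot."}
    print(make_training_example(sample, answer_field="summary"))
```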

    METHOD, APPARATUS, DEVICE AND MEDIUM FOR VERIFYING OUTPUT DATA OF PROCESSING MODEL

    Publication No.: US20240412115A1

    Publication Date: 2024-12-12

    Application No.: US18814332

    Filing Date: 2024-08-23

    Abstract: A method, apparatus, device and medium for verifying output data of a processing model are provided. In the method, in response to receiving input data for the processing model, output data corresponding to the input data is determined from the processing model. A verification result associated with the input data and the output data is acquired from a verification model corresponding to the processing model, the verification result indicating a matching degree between the output data and the input data. A final result corresponding to the input data is provided based on the verification result and the output data.
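
    The verification flow in this abstract can be sketched as below. The sketch assumes both models are plain callables, that the matching degree is a number in [0, 1], and that a low score triggers a fixed fallback message; the threshold, the fallback text, and the function name `answer_with_verification` are hypothetical, since the abstract does not say how the final result is derived.

```python
from typing import Callable


def answer_with_verification(
    input_data: str,
    processing_model: Callable[[str], str],
    verification_model: Callable[[str, str], float],
    threshold: float = 0.5,
    fallback: str = "No reliable answer available.",
) -> str:
    # Step 1: obtain output data from the processing model.
    output = processing_model(input_data)
    # Step 2: ask the verification model for a matching degree between
    # the input data and the output data (assumed to lie in [0, 1]).
    score = verification_model(input_data, output)
    # Step 3: provide the final result based on both the score and the output.
    return output if score >= threshold else fallback


if __name__ == "__main__":
    # Stub models standing in for real ones.
    model = lambda question: "4"
    verifier = lambda question, answer: 0.9 if answer == "4" else 0.1
    print(answer_with_verification("What is 2 + 2?", model, verifier))
```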

    METHOD, DEVICE AND STORAGE MEDIUM FOR MODEL EVALUATION

    Publication No.: US20240412049A1

    Publication Date: 2024-12-12

    Application No.: US18814317

    Filing Date: 2024-08-23

    Abstract: In embodiments of the present disclosure, a solution for model evaluation is provided. The method comprises: providing a plurality of inputs in an input set to a first generative model to obtain a first output set output by the first generative model, wherein the first output set comprises a plurality of outputs corresponding to the plurality of inputs; obtaining first labelling information corresponding to the plurality of outputs in the first output set, the first labelling information indicating, for each output and for each of a plurality of quality evaluation dimensions, the quality level assigned to that output among a plurality of quality levels defined for that dimension; and determining a first overall quality score of the first generative model at least based on the first labelling information of the outputs and respective quality scores corresponding to the plurality of quality levels defined in the plurality of quality evaluation dimensions.
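
    One way to turn such labelling information into an overall score is sketched below. The abstract does not specify the aggregation, so this example assumes an unweighted mean over dimensions for each output, followed by a mean over outputs; the dimension names, level names, level scores, and the function name `overall_quality_score` are hypothetical.

```python
from typing import Dict, List

# Hypothetical quality scores for the levels defined in each evaluation dimension.
LEVEL_SCORES: Dict[str, Dict[str, float]] = {
    "accuracy": {"good": 1.0, "fair": 0.5, "poor": 0.0},
    "fluency": {"good": 1.0, "fair": 0.5, "poor": 0.0},
}


def overall_quality_score(
    labels: List[Dict[str, str]],
    level_scores: Dict[str, Dict[str, float]],
) -> float:
    """Average per-output scores, where each output's score is the mean of the
    scores of its labelled level in every dimension (an assumed aggregation)."""
    per_output = []
    for label in labels:
        scores = [level_scores[dim][level] for dim, level in label.items()]
        per_output.append(sum(scores) / len(scores))
    return sum(per_output) / len(per_output)


if __name__ == "__main__":
    # One label dictionary per model output: dimension -> assigned quality level.
    labels = [
        {"accuracy": "good", "fluency": "fair"},
        {"accuracy": "poor", "fluency": "good"},
    ]
    print(overall_quality_score(labels, LEVEL_SCORES))
```

    Computing the same score for a second generative model on the same input set would allow the two models to be compared.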
