Model Training Method, Electronic Device, And Storage Medium

    Publication number: US20230142217A1

    Publication date: 2023-05-11

    Application number: US17896690

    Application date: 2022-08-26

    CPC classification number: G06F40/47 G06F40/166 G06F40/30 G06F40/295 G06F40/151

    Abstract: The present disclosure provides a model training method and apparatus, an electronic device, and a storage medium, and relates to the field of artificial intelligence, in particular to natural language processing and deep learning. A specific implementation solution includes: constructing initial training corpora; performing data enhancement on the initial training corpora based on an algorithm contained in a target algorithm set to obtain target training corpora, wherein the target algorithm set is determined from multiple algorithm sets, and different algorithm sets are used for performing data enhancement on corpora of different granularities in the initial training corpora; and training a language model based on the target training corpora to obtain a sequence labeling model, wherein the language model is pre-trained based on text corpora.
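
The enhancement step in this abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from the patent: the two granularities ("token" and "entity"), the BIO tag scheme, the lookup tables `SYNONYMS` and `ENTITIES`, and the two augmentation functions. The abstract does not say how the target algorithm set is determined, so the sketch simply takes the granularity as a parameter.

```python
import random

# Hypothetical lookup tables, invented for illustration only.
SYNONYMS = {"visited": ["toured"]}
ENTITIES = {"LOC": [["New", "York"], ["Paris"]]}

def replace_synonyms(tokens, tags, rng):
    """Token granularity: swap non-entity tokens for a synonym; tags are unchanged."""
    out = [rng.choice(SYNONYMS[tok]) if tag == "O" and tok in SYNONYMS else tok
           for tok, tag in zip(tokens, tags)]
    return out, list(tags)

def replace_entities(tokens, tags, rng):
    """Entity granularity: swap each entity span for another mention of the same type."""
    out_tokens, out_tags, i = [], [], 0
    while i < len(tokens):
        if tags[i].startswith("B-"):
            label = tags[i][2:]
            j = i + 1
            while j < len(tokens) and tags[j] == "I-" + label:
                j += 1
            # Fall back to the original mention when no replacement is known.
            mention = rng.choice(ENTITIES.get(label, [tokens[i:j]]))
            out_tokens.extend(mention)
            out_tags.extend(["B-" + label] + ["I-" + label] * (len(mention) - 1))
            i = j
        else:
            out_tokens.append(tokens[i])
            out_tags.append(tags[i])
            i += 1
    return out_tokens, out_tags

# One algorithm set per corpus granularity (assumed grouping).
ALGORITHM_SETS = {"token": [replace_synonyms], "entity": [replace_entities]}

def augment(corpus, granularity, seed=0):
    """Return the initial corpus plus one augmented copy per algorithm in the target set."""
    rng = random.Random(seed)
    target_set = ALGORITHM_SETS[granularity]
    augmented = list(corpus)
    for tokens, tags in corpus:
        for algorithm in target_set:
            augmented.append(algorithm(tokens, tags, rng))
    return augmented
```

Calling `augment(corpus, "entity")` on a list of `(tokens, tags)` pairs returns the original pairs followed by their augmented copies, which could then serve as the target training corpora for fine-tuning a pre-trained language model.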

    METHOD FOR IMAGE TEXT RECOGNITION, APPARATUS, DEVICE AND STORAGE MEDIUM

    公开(公告)号:US20210081729A1

    公开(公告)日:2021-03-18

    申请号:US16984231

    申请日:2020-08-04

    Abstract: The present application discloses a method for image text recognition, an apparatus, a device, and a storage medium, and relates to image processing technologies in the field of cloud computing. A specific implementation is: acquiring an image to be processed, where at least one text line exists in the image to be processed; processing each text line in the image to be processed to obtain a composite encoded vector corresponding to each word in each text line, where the composite encoded vector carries semantic information and position information; and determining a text recognition result of the image to be processed according to the semantic information and the position information carried in the composite encoded vector corresponding to each word in each text line. This technical solution can accurately distinguish adjacent fields with small pixel spacing in the image and improve the accuracy of text recognition in the image.
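
As a rough illustration of a vector that carries both semantic and position information, the Python sketch below concatenates two feature lists per word. It is not the encoder described in the patent: `semantic_features` is a toy stand-in for a learned embedding, and the bounding-box format `(x0, y0, x1, y1)` and the normalization by image size are assumptions.

```python
def semantic_features(word):
    """Toy stand-in for a learned semantic embedding (assumption)."""
    n = max(len(word), 1)  # avoid division by zero for an empty string
    digits = sum(c.isdigit() for c in word)
    letters = sum(c.isalpha() for c in word)
    return [digits / n, letters / n]

def position_features(box, image_size):
    """Normalize a word's bounding box (x0, y0, x1, y1) by the image size."""
    x0, y0, x1, y1 = box
    width, height = image_size
    return [x0 / width, y0 / height, x1 / width, y1 / height]

def composite_vector(word, box, image_size):
    """Concatenate semantic and position features into one vector per word."""
    return semantic_features(word) + position_features(box, image_size)
```

Because the position components stay in the vector, two adjacent words with similar content but different coordinates receive different encodings, which is the property the abstract relies on to separate closely spaced fields.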

    METHOD OF TRAINING CLASSIFICATION MODEL, METHOD OF CLASSIFYING SAMPLE, AND DEVICE

    公开(公告)号:US20220383190A1

    公开(公告)日:2022-12-01

    申请号:US17619533

    申请日:2021-05-17

    Abstract: The present disclosure provides a method of training a classification model, which relates to active learning, neural network, and natural language processing technologies. A specific implementation scheme includes: selecting, from an original sample set, a plurality of original samples whose class prediction results meet a preset condition as to-be-labeled samples, according to class prediction results for a plurality of original samples in the original sample set; labeling each to-be-labeled sample as belonging to a class by using a second classification model, so as to obtain a first labeled sample set; and training a first classification model by using the first labeled sample set. The present disclosure further provides a method of classifying a sample, an electronic device, and a storage medium.
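
The selection and labeling steps can be sketched in a few lines of Python. The sketch rests on assumptions the abstract leaves open: the preset condition is taken to be "top class probability below a threshold", the class predictions are supplied by a caller-provided `predict_proba` function, and `second_model_label` stands for the second classification model. All of these names are hypothetical.

```python
def select_to_be_labeled(samples, predict_proba, threshold=0.6):
    """Select samples whose top class probability is below the threshold (assumed condition)."""
    return [s for s in samples if max(predict_proba(s)) < threshold]

def build_labeled_set(samples, predict_proba, second_model_label, threshold=0.6):
    """Label the selected samples with the second model to form the first labeled sample set."""
    selected = select_to_be_labeled(samples, predict_proba, threshold)
    return [(s, second_model_label(s)) for s in selected]
```

The returned list of `(sample, label)` pairs would then be passed to whatever training routine updates the first classification model; that routine is outside the scope of this sketch.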
