-
Publication number: US20230142217A1
Publication date: 2023-05-11
Application number: US17896690
Filing date: 2022-08-26
Inventor: Huihui HE, Leyi WANG, Duohao QIN, Minghao LIU
IPC: G06F40/47, G06F40/166, G06F40/30, G06F40/295, G06F40/151
CPC classification number: G06F40/47, G06F40/166, G06F40/30, G06F40/295, G06F40/151
Abstract: The present disclosure provides a model training method and apparatus, an electronic device, and a storage medium, and relates to the field of artificial intelligence, in particular to natural language processing and deep learning. A specific implementation includes: constructing initial training corpora; performing data enhancement on the initial training corpora based on an algorithm contained in a target algorithm set to obtain target training corpora, wherein the target algorithm set is determined from multiple algorithm sets, and different algorithm sets are used for performing data enhancement on corpora of different granularities in the initial training corpora; and training a language model based on the target training corpora to obtain a sequence labeling model, wherein the language model is pre-trained on text corpora.
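The data enhancement step in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the two granularities ("word" and "entity"), the `SYNONYMS` and `ENTITIES` tables, and all function names are assumptions invented for illustration, and the abstract does not say which algorithms each set contains.

```python
import random
from typing import Callable, Dict, List, Tuple

Example = List[Tuple[str, str]]  # (token, BIO tag) pairs
Augmenter = Callable[[Example, random.Random], Example]

# Hypothetical resources, invented for this sketch.
SYNONYMS = {"big": ["large"], "city": ["town"]}
ENTITIES = {"LOC": ["Paris", "Berlin"]}


def replace_synonym(example: Example, rng: random.Random) -> Example:
    """Word granularity: swap non-entity tokens for a synonym, keeping tags."""
    out = []
    for tok, tag in example:
        if tag == "O" and tok in SYNONYMS:
            tok = rng.choice(SYNONYMS[tok])
        out.append((tok, tag))
    return out


def replace_entity(example: Example, rng: random.Random) -> Example:
    """Entity granularity: swap single-token entities for one of the same type."""
    out = []
    for i, (tok, tag) in enumerate(example):
        next_tag = example[i + 1][1] if i + 1 < len(example) else "O"
        single = tag.startswith("B-") and not next_tag.startswith("I-")
        if single and tag[2:] in ENTITIES:
            tok = rng.choice(ENTITIES[tag[2:]])
        out.append((tok, tag))
    return out


# One algorithm set per granularity (assumed grouping).
ALGORITHM_SETS: Dict[str, List[Augmenter]] = {
    "word": [replace_synonym],
    "entity": [replace_entity],
}


def augment(corpora: List[Example], granularity: str, seed: int = 0) -> List[Example]:
    """Return the initial corpora plus one augmented copy per algorithm."""
    rng = random.Random(seed)
    target_set = ALGORITHM_SETS[granularity]  # the "target algorithm set"
    augmented = list(corpora)
    for example in corpora:
        for algorithm in target_set:
            augmented.append(algorithm(example, rng))
    return augmented


initial = [[("a", "O"), ("big", "O"), ("city", "O"), ("Rome", "B-LOC")]]
word_level = augment(initial, "word")
entity_level = augment(initial, "entity")
```

The resulting corpora would then be used to fine-tune a pre-trained language model for sequence labeling; that training step is left out of the sketch.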
-
Publication number: US20210081729A1
Publication date: 2021-03-18
Application number: US16984231
Filing date: 2020-08-04
Inventor: Xiangkai HUANG, Leyi WANG, Lei NIE, Siyu AN, Minghao LIU, Jiangliang GUO
Abstract: The present application discloses a method for image text recognition, an apparatus, a device, and a storage medium, and relates to image processing technologies in the field of cloud computing. A specific implementation is: acquiring an image to be processed, where at least one text line exists in the image to be processed; processing each text line in the image to be processed to obtain a composite encoded vector corresponding to each word in each text line, where the composite encoded vector carries semantic information and position information; and determining a text recognition result of the image to be processed according to the semantic information and the position information carried in the composite encoded vector corresponding to each word in each text line. This technical solution can accurately distinguish adjacent fields with small pixel spacing in the image and improve the accuracy of text recognition in the image.
-
Publication number: US20220383190A1
Publication date: 2022-12-01
Application number: US17619533
Filing date: 2021-05-17
Inventor: Huihui HE, Leyi WANG, Minghao LIU, Jiangliang GUO
Abstract: The present disclosure provides a method of training a classification model, which relates to active learning, neural network, and natural language processing technologies. A specific implementation scheme includes: selecting, from an original sample set, a plurality of original samples with a class prediction result meeting a preset condition as to-be-labeled samples according to a class prediction result for a plurality of original samples in the original sample set; labeling each to-be-labeled sample as belonging to a class by using the second classification model, so as to obtain a first labeled sample set; and training the first classification model by using the first labeled sample set. The present disclosure further provides a method of classifying a sample, an electronic device, and a storage medium.
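The selection and labeling steps in this abstract can be sketched in Python as follows. This is a sketch under stated assumptions, not the patented method: the abstract does not define the "preset condition", so it is assumed here to mean that the first model's top class probability falls below a threshold, and both toy models and every function name are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def select_to_be_labeled(
    samples: List[str],
    predict_proba: Callable[[str], Sequence[float]],
    max_confidence: float = 0.6,
) -> List[str]:
    """Keep samples whose top class probability is below the threshold.

    Assumed "preset condition": the first model is uncertain about the sample.
    """
    return [s for s in samples if max(predict_proba(s)) < max_confidence]


def build_labeled_set(
    selected: List[str],
    second_model: Callable[[str], int],
) -> List[Tuple[str, int]]:
    """Label every selected sample with the second classification model."""
    return [(s, second_model(s)) for s in selected]


def first_model_proba(text: str) -> Sequence[float]:
    # Toy stand-in for the first model: confident only on "refund" texts.
    return [0.9, 0.1] if "refund" in text else [0.55, 0.45]


def second_model_label(text: str) -> int:
    # Toy stand-in for the second model.
    return 1 if "late" in text else 0


pool = ["refund please", "arrived late", "ok"]
selected = select_to_be_labeled(pool, first_model_proba)
labeled = build_labeled_set(selected, second_model_label)
# `labeled` plays the role of the first labeled sample set; training the
# first model on it is omitted here.
```

Using a separate second model as the labeler lets the loop run without a human annotator, at the cost of inheriting that model's errors in the labeled set.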
-