-
21.
Publication number: US11526668B2
Publication date: 2022-12-13
Application number: US17095955
Application date: 2020-11-12
IPC: G06F40/279 , G06F16/9032 , G06N20/00 , G06F40/205 , G06K9/62
Abstract: A method and apparatus for obtaining word vectors based on a language model, a device and a storage medium are disclosed, which relate to the field of natural language processing technologies in artificial intelligence. An implementation includes inputting each of at least two first sample text language materials into the language model, and outputting a context vector of a first word mask in each first sample text language material via the language model; determining the word vector corresponding to each first word mask based on a first word vector parameter matrix, a second word vector parameter matrix and a fully connected matrix respectively; and training the language model and the fully connected matrix based on the word vectors corresponding to the first word masks in the at least two first sample text language materials, so as to obtain the word vectors.
-
22.
Publication number: US20220004716A1
Publication date: 2022-01-06
Application number: US17209124
Application date: 2021-03-22
Inventor: Shuohuan Wang , Jiaxiang Liu , Xuan Ouyang , Yu Sun , Hua Wu , Haifeng Wang
Abstract: The present application discloses a method and apparatus for training a semantic representation model, a device and a computer storage medium, which relates to the field of natural language processing technologies in artificial intelligence. An implementation includes: acquiring a semantic representation model which has been trained for a first language as a first semantic representation model; taking a bottom layer and a top layer of the first semantic representation model as trained layers, initializing the trained layers, keeping model parameters of other layers unchanged, and training the trained layers using training language materials of a second language until a training ending condition is met; successively bringing the untrained layers into the trained layers from bottom to top, and executing these layers respectively: keeping the model parameters of other layers than the trained layers unchanged, and training the trained layers using the training language materials of the second language until the training ending condition is met respectively; and obtaining a semantic representation model for the second language after all the layers are trained.
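The staged training order described in this abstract can be sketched as a schedule of trainable layers. This is a minimal illustration, not the patented implementation: the function name `unfreeze_schedule` is hypothetical, and it assumes that exactly one untrained layer joins the trained set per stage, which the abstract does not specify.

```python
from typing import List


def unfreeze_schedule(num_layers: int) -> List[List[int]]:
    """Return, for each training stage, the indices of the trainable layers.

    Stage 0 trains only the bottom layer (index 0) and the top layer
    (index num_layers - 1). Each later stage adds the lowest layer that
    is still frozen, so layers are brought in from bottom to top.
    Assumption: one layer is added per stage.
    """
    if num_layers < 2:
        raise ValueError("need at least a bottom layer and a top layer")
    trained = {0, num_layers - 1}
    stages = [sorted(trained)]
    for layer in range(1, num_layers - 1):
        trained.add(layer)
        stages.append(sorted(trained))
    return stages


if __name__ == "__main__":
    for stage, layers in enumerate(unfreeze_schedule(4)):
        print(f"stage {stage}: train layers {layers}, freeze the rest")
```

In each stage, all layers outside the returned list keep their first-language parameters unchanged while the listed layers are trained on second-language material until the ending condition is met.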
-
Publication number: US20210232765A1
Publication date: 2021-07-29
Application number: US16988907
Application date: 2020-08-10
Inventor: Han Zhang , Dongling Xiao , Yukun Li , Yu Sun , Hao Tian , Hua Wu , Haifeng Wang
IPC: G06F40/274 , G06F40/30 , G06F40/56 , G06K9/62
Abstract: The present disclosure discloses a method and an apparatus for generating a text based on a semantic representation and relates to a field of natural language processing (NLP) technologies. The method for generating the text includes: obtaining an input text, the input text comprising a source text; obtaining a placeholder of an ith word to be predicted in a target text; obtaining a vector representation of the ith word to be predicted, in which the vector representation of the ith word to be predicted is obtained by calculating the placeholder of the ith word to be predicted, the source text and 1st to (i−1)th predicted words by employing a self-attention mechanism; and generating an ith predicted word based on the vector representation of the ith word to be predicted, to obtain a target text.
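The visibility rule in this abstract (the placeholder of the ith word attends to the source text and to the 1st through (i−1)th predicted words) can be written as one row of a self-attention mask. The sketch below is illustrative only: the function name and the sequence layout `[source tokens | target word slots | placeholder]` are assumptions, not details taken from the patent.

```python
from typing import List


def placeholder_attention_row(source_len: int, i: int, target_len: int) -> List[bool]:
    """Build the attention-mask row for the placeholder of the i-th target word.

    Assumed layout: source_len source tokens, then target_len target word
    slots (1-indexed), then the placeholder itself. True means the position
    is visible to the placeholder.
    """
    if not 1 <= i <= target_len:
        raise ValueError("i must be between 1 and target_len")
    source_part = [True] * source_len                       # whole source text is visible
    target_part = [k < i for k in range(1, target_len + 1)]  # only words 1..i-1 are visible
    return source_part + target_part + [True]               # placeholder sees itself


if __name__ == "__main__":
    print(placeholder_attention_row(source_len=3, i=2, target_len=4))
```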
-
24.
Publication number: US12118063B2
Publication date: 2024-10-15
Application number: US17209051
Application date: 2021-03-22
IPC: G06F40/30 , G06F18/214 , G06F18/2413
CPC classification number: G06F18/2148 , G06F18/24147 , G06F40/30
Abstract: The present disclosure provides a method, apparatus, electronic device and storage medium for training a semantic similarity model, which relates to the field of artificial intelligence. A specific implementation solution is as follows: obtaining a target field to be used by a semantic similarity model to be trained; calculating respective correlations between the target field and application fields corresponding to each of multiple known training datasets; training the semantic similarity model with the training datasets in turn, according to the respective correlations between the target field and the application fields corresponding to each of the training datasets. According to the technical solution of the present disclosure, it is possible to, in the fine-tuning phase, more purposefully train the semantic similarity model with the training datasets with reference to the correlations between the target field and the application fields corresponding to the training datasets, thereby effectively improving the learning capability of the semantic similarity model and effectively improving the accuracy of the trained semantic similarity model.
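Ordering the training datasets by their correlation with the target field can be sketched as a simple sort. Two points here are assumptions and do not come from the abstract: the correlation measure (Jaccard overlap of hypothetical keyword sets that describe each field) and the direction of the ordering (least correlated first, so that the most relevant dataset is trained last).

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two keyword sets; stands in for the field correlation."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def training_order(target: Set[str], datasets: Dict[str, Set[str]]) -> List[str]:
    """Return dataset names sorted from least to most correlated with the target field."""
    return sorted(datasets, key=lambda name: jaccard(target, datasets[name]))


if __name__ == "__main__":
    target_field = {"loan", "bank", "credit"}
    known = {
        "medical": {"drug", "dose"},
        "finance": {"bank", "credit", "stock"},
        "general": {"bank", "weather", "sports", "news"},
    }
    for name in training_order(target_field, known):
        print("fine-tune on", name)
```

With this ordering, fine-tuning would proceed through the datasets in turn, one after another, in the listed sequence.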
-
Publication number: US11663404B2
Publication date: 2023-05-30
Application number: US17101789
Application date: 2020-11-23
Inventor: Shuohuan Wang , Siyu Ding , Yu Sun , Hua Wu , Haifeng Wang
IPC: G06F40/279 , G06N20/00 , G06F40/166 , G06F40/30
CPC classification number: G06F40/279 , G06F40/166 , G06F40/30 , G06N20/00
Abstract: The disclosure provides a text recognition method, an electronic device, and a storage medium. The method includes: obtaining N segments of a sample text; inputting each of the N segments into a preset initial language model in sequence, to obtain first text vector information corresponding to the N segments; inputting each of the N segments into the initial language model in sequence again, to obtain second text vector information corresponding to a currently input segment; in response to determining that the currently input segment has a mask, predicting the mask according to the second text vector information and the first text vector information to obtain a predicted word at a target position corresponding to the mask; training the initial language model according to an original word and the predicted word to generate a long text language model; and recognizing an input text through the long text language model.
-
Publication number: US11481419B2
Publication date: 2022-10-25
Application number: US15981334
Application date: 2018-05-16
Inventor: Shengxian Wan , Yu Sun , Dianhai Yu
IPC: G06N3/04 , G06F16/33 , G06N5/04 , G06K9/62 , G06F40/258 , G06V30/262
Abstract: The present disclosure provides a method and apparatus for evaluating a matching degree based on artificial intelligence, a device and a storage medium, wherein the method comprises: respectively obtaining word expressions of words in a query and word expressions of words in a title; respectively obtaining context-based word expressions of words in the query and context-based word expressions of words in the title according to the word expressions; generating matching features according to obtained information; determining a matching degree score between the query and the title according to the matching features. The solution of the present disclosure may be applied to improve the accuracy of the evaluation result.
-
Publication number: US20200210522A1
Publication date: 2020-07-02
Application number: US16691104
Application date: 2019-11-21
Inventor: Jingwei Wang , Ao Zhang , Jiaxiang Liu , Yu Sun , Zhi Li
Abstract: Embodiments of the present disclosure disclose a method and apparatus for determining a topic. A specific embodiment of the method comprises: determining a to-be-recognized sentence sequence; calculating similarities between the to-be-recognized sentence sequence and each of the topic templates in a topic template set in a target area, each of the topic templates in the topic template set corresponding to a topic in at least one topic in the target area, the topic template including a topic section sequence, and a topic section including a topic sentence sequence; and determining a topic of the to-be-recognized sentence sequence according to an associated parameter, the associated parameter including the similarities between the to-be-recognized sentence sequence and each of the topic templates in the topic template set. This embodiment reduces labor costs during topic segmentation.
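The template matching in this abstract can be illustrated with a small scoring routine. This is a sketch under stated assumptions: the similarity measure (token-level Jaccard overlap, averaged over the best match for each input sentence) and all function names are hypothetical, since the abstract does not say how the similarities are computed. A template is modelled as a list of topic sections, each of which is a list of topic sentences.

```python
from typing import Dict, List

Template = List[List[str]]  # topic sections, each a sequence of topic sentences


def sentence_similarity(a: str, b: str) -> float:
    """Jaccard overlap of the whitespace-separated tokens of two sentences."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def template_similarity(sentences: List[str], template: Template) -> float:
    """Average, over the input sentences, of the best match in the template."""
    topic_sentences = [s for section in template for s in section]
    if not sentences or not topic_sentences:
        return 0.0
    best = [max(sentence_similarity(s, t) for t in topic_sentences) for s in sentences]
    return sum(best) / len(best)


def determine_topic(sentences: List[str], templates: Dict[str, Template]) -> str:
    """Pick the topic whose template is most similar to the sentence sequence."""
    return max(templates, key=lambda topic: template_similarity(sentences, templates[topic]))


if __name__ == "__main__":
    templates = {
        "billing": [["how much is the fee", "please pay the bill"]],
        "shipping": [["where is my package"], ["delivery takes three days"]],
    }
    query = ["where is the package", "how many days does delivery take"]
    print(determine_topic(query, templates))
```

Here the similarity is the only associated parameter; the abstract allows the associated parameter to include other signals as well.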
-