-
Publication No.: US10831993B2
Publication Date: 2020-11-10
Application No.: US16306488
Filing Date: 2016-12-22
Inventor: Kunsheng Zhou, Jingzhou He, Lei Shi, Shikun Feng
IPC: G06F40/242, G06F16/00, G06F40/30, G06N3/02, G06F40/20, G06F40/284, G06F17/18, G06N3/08
Abstract: Disclosed are a method and an apparatus for constructing a binary feature dictionary. The method may include: extracting binary features from a corpus; calculating a preset statistic of each binary feature; and selecting a preset number of binary features in sequence according to the preset statistic to constitute the binary feature dictionary.
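The three steps in this abstract (extract, score, select) can be illustrated with a minimal sketch. It rests on assumptions the abstract does not confirm: a "binary feature" is taken to be a pair of adjacent whitespace-separated tokens, and the "preset statistic" is taken to be the raw occurrence count. The function name `build_binary_feature_dictionary` is hypothetical.

```python
from collections import Counter


def build_binary_feature_dictionary(corpus, dictionary_size):
    """Return the top `dictionary_size` adjacent-token pairs by frequency.

    Assumptions (not from the patent abstract): a binary feature is a pair
    of adjacent tokens, and the preset statistic is its raw count.
    """
    counts = Counter()
    for sentence in corpus:
        tokens = sentence.split()
        # Step 1: extract binary features (adjacent token pairs).
        counts.update(zip(tokens, tokens[1:]))
    # Step 2: the statistic of each feature is its count.
    # Step 3: rank by descending statistic; ties are broken alphabetically
    # so the result is deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [feature for feature, _ in ranked[:dictionary_size]]


corpus = ["a b c", "a b d", "b c"]
print(build_binary_feature_dictionary(corpus, 2))
```

Any other per-feature statistic could replace the count in step 2 without changing the selection logic in step 3.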
-
Publication No.: US11520991B2
Publication Date: 2022-12-06
Application No.: US16885358
Filing Date: 2020-05-28
Inventor: Yu Sun, Haifeng Wang, Shuohuan Wang, Yukun Li, Shikun Feng, Hao Tian, Hua Wu
Abstract: The present disclosure provides a method, apparatus, electronic device and storage medium for processing a semantic representation model, and relates to the field of artificial intelligence technologies. A specific implementation is: collecting a training corpus set including a plurality of training corpora; and training the semantic representation model using the training corpus set based on at least one of lexicon, grammar and semantics. By building unsupervised or weakly-supervised training tasks at three different levels, namely lexicon, grammar and semantics, the disclosure enables the semantic representation model to learn knowledge at those levels from massive data, which enhances its capability for universal semantic representation and improves its performance on NLP tasks.
-
Publication No.: US20190057164A1
Publication Date: 2019-02-21
Application No.: US16054559
Filing Date: 2018-08-03
Inventor: Kunsheng Zhou, Shikun Feng, Zhifan Zhu, Jingzhou He
Abstract: Disclosed are a search method and apparatus based on artificial intelligence. An embodiment of the method includes: receiving search information entered by a user; determining a candidate to-be-pushed message set based on the search information; predicting a probability of being clicked for a candidate to-be-pushed message in the candidate to-be-pushed message set using a pre-trained scoring model based on the search information and the candidate to-be-pushed message set, the scoring model being obtained by training based on a pre-stored first search information set, a to-be-pushed message set corresponding to a piece of first search information in the first search information set, and a preset priority of a to-be-pushed message in the to-be-pushed message set; and selecting a preset number of the candidate to-be-pushed messages to form a message sequence in descending order of the probability of being clicked, and pushing the message sequence to a terminal of the user.
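The final selection step of this abstract (score each candidate, keep a preset number in descending order of click probability) can be sketched as follows. The pre-trained scoring model is not described in the abstract, so it is represented here by a caller-supplied function; `select_messages` and the toy `overlap_score` are hypothetical names used only for illustration.

```python
def select_messages(search_info, candidates, score, top_n):
    """Return the `top_n` candidates with the highest predicted click probability.

    `score(search_info, candidate)` stands in for the pre-trained scoring
    model; its real form is not given in the abstract.
    """
    scored = [(score(search_info, candidate), candidate) for candidate in candidates]
    # Sort by probability only; Python's sort is stable, so candidates with
    # equal scores keep their original relative order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [candidate for _, candidate in scored[:top_n]]


def overlap_score(search_info, message):
    """Toy stand-in scorer: fraction of message words found in the query."""
    query_words = set(search_info.split())
    message_words = set(message.split())
    return len(query_words & message_words) / len(message_words)


candidates = ["red shoes sale", "blue hat", "red hat"]
print(select_messages("red shoes", candidates, overlap_score, top_n=2))
```

Passing the scorer as a parameter keeps the ranking logic independent of how the click probability is actually predicted.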
-
Publication No.: US20210342549A1
Publication Date: 2021-11-04
Application No.: US17375156
Filing Date: 2021-07-14
Inventor: Jiaxiang Liu, Shikun Feng
IPC: G06F40/30, G06F40/58, G06N20/00, G06F16/901
Abstract: The disclosure provides a method for training a semantic analysis model, an electronic device and a storage medium. The method includes: obtaining a plurality of training data, in which each of the plurality of training data comprises a search word, information on at least one text obtained by searching the search word, and at least one associated word corresponding to the at least one text; constructing a graph model based on the training data, and determining target training data from the plurality of training data by using the graph model, the target training data comprising search word samples, information samples and associated word samples; and training a semantic analysis model based on the search word samples, the information samples, and the associated word samples.
-
Publication No.: US10606949B2
Publication Date: 2020-03-31
Application No.: US15921386
Filing Date: 2018-03-14
Inventor: Zhifan Zhu, Shikun Feng, Kunsheng Zhou, Jingzhou He
Abstract: Disclosed are an artificial intelligence based method and apparatus for checking a text. An embodiment of the method comprises: lexing a first to-be-checked text and a second to-be-checked text respectively, determining word vectors of the lexed words to generate a first word vector sequence and a second word vector sequence; inputting the first word vector sequence and the second word vector sequence respectively into a pre-trained convolutional neural network containing at least one multi-scale convolutional layer, identifying vector sequences in a plurality of vector sequences outputted by a last multi-scale convolutional layer as eigenvector sequences, to obtain eigenvector sequence groups respectively corresponding to the texts; combining eigenvector sequences in each eigenvector sequence group to generate a combined eigenvector sequence; and analyzing the generated combined eigenvector sequences to determine whether the first text and the second text pass a similarity check. The embodiment improves the flexibility in checking a text.
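The idea of a multi-scale convolutional layer followed by a combining step can be illustrated with a small pure-Python sketch. It is not the patented network: the learned convolution kernels are replaced by uniform averaging windows, the sequence is zero-padded at the end so every scale yields the same length, and "combining" is assumed to mean concatenating the per-scale vectors position by position. The names `multi_scale_layer` and `combine` are hypothetical.

```python
def multi_scale_layer(vectors, widths=(1, 2, 3)):
    """Return one output vector sequence per window width.

    Each output vector is the average of `width` consecutive input vectors
    (a uniform kernel standing in for learned convolution weights).
    """
    dim = len(vectors[0])
    outputs = []
    for width in widths:
        # Zero-pad the tail so each scale keeps the input sequence length.
        padded = vectors + [[0.0] * dim] * (width - 1)
        sequence = []
        for start in range(len(vectors)):
            window = padded[start:start + width]
            sequence.append([sum(v[d] for v in window) / width for d in range(dim)])
        outputs.append(sequence)
    return outputs


def combine(sequences):
    """Concatenate the per-scale vectors at each position into one sequence."""
    length = len(sequences[0])
    return [sum((seq[i] for seq in sequences), []) for i in range(length)]


word_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
group = multi_scale_layer(word_vectors, widths=(1, 2))
print(combine(group))
```

Running the same two functions on a second text would give the second combined sequence; how the two are then compared for the similarity check is not specified in the abstract and is left out here.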
-