-
Publication No.: US11176469B2
Publication Date: 2021-11-16
Application No.: US17244811
Filing Date: 2021-04-29
Applicant: Advanced New Technologies Co., Ltd.
Inventor: Chaochao Chen, Liang Li, Jun Zhou
Abstract: A first training participant performs an iterative process until a predetermined condition is satisfied, where the iterative process includes: obtaining, using secret sharing matrix addition and based on the current sub-model of each training participant and a corresponding feature sample subset of each training participant, a current prediction value of the regression model for a feature sample set, where the corresponding feature sample subset of each training participant is obtained by performing vertical segmentation on the feature sample set; determining a prediction difference between the current prediction value and a label corresponding to the current prediction value; sending the prediction difference to each second training participant; and updating a current sub-model of the first training participant based on the current sub-model of the first training participant and a product of a corresponding feature sample subset of the first training participant and the prediction difference.
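The iteration described in this abstract can be sketched in a single process as follows. This is a minimal illustration, not the patented protocol: the function names, the floating-point additive shares, the averaging of the gradient over the sample count, and the assumption that second participants apply the same update rule as the first are all choices made for the example.

```python
import random

def partial_prediction(X, w):
    """Local linear score X @ w over one participant's feature columns."""
    return [sum(x_j * w_j for x_j, w_j in zip(row, w)) for row in X]

def split_into_shares(vec, n_parties, rng):
    """Additively split vec into n_parties random vectors that sum to vec."""
    shares = [[rng.uniform(-1.0, 1.0) for _ in vec] for _ in range(n_parties - 1)]
    last = [v - sum(s[i] for s in shares) for i, v in enumerate(vec)]
    return shares + [last]

def secret_shared_sum(vectors, rng):
    """Sum the parties' vectors while each party only handles random shares."""
    n, length = len(vectors), len(vectors[0])
    all_shares = [split_into_shares(v, n, rng) for v in vectors]
    # Party j adds up the j-th share it received from every party.
    per_party = [
        [sum(all_shares[i][j][k] for i in range(n)) for k in range(length)]
        for j in range(n)
    ]
    # Only the per-party sums are combined into the final result.
    return [sum(p[k] for p in per_party) for k in range(length)]

def train(X_parts, y, lr=0.5, n_iter=300, seed=0):
    """X_parts[i] holds participant i's vertical slice of the feature set."""
    rng = random.Random(seed)
    weights = [[0.0] * len(X[0]) for X in X_parts]
    for _ in range(n_iter):
        partials = [partial_prediction(X, w) for X, w in zip(X_parts, weights)]
        y_hat = secret_shared_sum(partials, rng)
        diff = [p - t for p, t in zip(y_hat, y)]  # prediction difference
        # Assumed: every participant updates its sub-model with X_i^T diff / n.
        for X, w in zip(X_parts, weights):
            for j in range(len(w)):
                grad = sum(row[j] * d for row, d in zip(X, diff)) / len(y)
                w[j] -= lr * grad
    return weights
```

A production system would typically use fixed-point shares over a finite ring rather than floating-point masks, and would exchange the shares over a network instead of inside one function.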
-
Publication No.: US10902332B2
Publication Date: 2021-01-26
Application No.: US16725589
Filing Date: 2019-12-23
Applicant: Advanced New Technologies Co., Ltd.
Inventor: Chaochao Chen, Jun Zhou
Abstract: A client device determines a local user gradient value based on a current user preference vector and a local item gradient value based on a current item feature vector. The client device updates a user preference vector by using the local user gradient value and updates an item feature vector by using the local item gradient value. The client device determines a neighboring client device based on a predetermined adjacency relationship. The local item gradient value is sent by the client device to the neighboring client device. The client device receives a neighboring item gradient value sent by the neighboring client device. The client device updates the item feature vector by using the neighboring item gradient value. In response to the client device determining that a predetermined iteration stop condition is satisfied, the client device outputs the user preference vector and the item feature vector.
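The client-side steps in this abstract resemble decentralized matrix factorization. The sketch below assumes a squared-error loss with L2 regularization, a `Client` class, and an `adjacency` dictionary mapping a client index to its neighbors; none of these names or choices come from the patent text.

```python
import random

class Client:
    def __init__(self, ratings, n_items, dim, rng):
        self.ratings = ratings  # {item_id: rating}; stays on the device
        self.user_vec = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
        self.item_vecs = [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
                          for _ in range(n_items)]

    def local_step(self, lr, reg):
        """Update the local vectors and return the item gradients to share."""
        item_grads = {}
        for item, r in self.ratings.items():
            u, v = self.user_vec, self.item_vecs[item]
            err = sum(a * b for a, b in zip(u, v)) - r
            user_grad = [err * b + reg * a for a, b in zip(u, v)]
            item_grad = [err * a + reg * b for a, b in zip(u, v)]
            self.user_vec = [a - lr * g for a, g in zip(u, user_grad)]
            self.item_vecs[item] = [b - lr * g for b, g in zip(v, item_grad)]
            item_grads[item] = item_grad
        return item_grads

    def apply_neighbor_grads(self, item_grads, lr):
        """Update the local item vectors with a neighbor's item gradients."""
        for item, g in item_grads.items():
            v = self.item_vecs[item]
            self.item_vecs[item] = [b - lr * gi for b, gi in zip(v, g)]

def train(clients, adjacency, lr=0.05, reg=0.01, n_iter=100):
    """Run a fixed number of rounds (the assumed iteration stop condition)."""
    for _ in range(n_iter):
        outgoing = [c.local_step(lr, reg) for c in clients]
        for i, neighbors in adjacency.items():
            for j in neighbors:
                clients[j].apply_neighbor_grads(outgoing[i], lr)
    return clients
```

Only item gradients cross device boundaries in this sketch; the ratings and the user preference vector never leave the client that owns them.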
-
Publication No.: US10901971B2
Publication Date: 2021-01-26
Application No.: US16736603
Filing Date: 2020-01-07
Applicant: ADVANCED NEW TECHNOLOGIES CO., LTD.
Inventor: Shaosheng Cao, Xinxing Yang, Jun Zhou
IPC: G06F16/00, G06F16/22, G06F16/27, G06F16/901, H04L29/08
Abstract: Embodiments of the present specification disclose random walking and a cluster-based random walking method, apparatus and device. A solution includes: obtaining information about each node included in graph data, generating, according to the information about each node, a hash table reflecting a correspondence between the node and an adjacent node of the node, and generating a random sequence according to the hash table, to implement random walking in the graph data. The solution is applicable to clusters and single machines.
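A single-machine version of the idea can be sketched with a Python dictionary standing in for the hash table. The function names and the choice to treat the graph as undirected are assumptions made for this illustration.

```python
import random

def build_adjacency_table(edges):
    """Map each node to the list of its adjacent nodes (undirected graph assumed)."""
    table = {}
    for a, b in edges:
        table.setdefault(a, []).append(b)
        table.setdefault(b, []).append(a)
    return table

def random_walk(table, start, length, rng):
    """Generate a random sequence of up to `length` nodes beginning at `start`."""
    walk = [start]
    while len(walk) < length:
        neighbors = table.get(walk[-1])
        if not neighbors:  # isolated or unknown node: stop early
            break
        walk.append(rng.choice(neighbors))
    return walk
```

For example, `random_walk(build_adjacency_table([(1, 2), (2, 3), (3, 1)]), 1, 5, random.Random(0))` returns a five-node sequence in which every consecutive pair is connected by an edge. In a cluster setting the table would presumably be partitioned or replicated across workers, which this sketch does not attempt.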
-
Publication No.: US11226993B2
Publication Date: 2022-01-18
Application No.: US16684831
Filing Date: 2019-11-15
Applicant: ADVANCED NEW TECHNOLOGIES CO., LTD.
Inventor: Jun Zhou, Xiaolong Li
IPC: G06F16/28, G06F16/2455, G06F16/2458, G06F16/21, G06F16/27, G06N7/00
Abstract: Provided is a method for clustering a data stream. The method comprises: acquiring a plurality of resulting models of a plurality of preceding data partitions prior to a current data partition in a data stream, wherein data partitions in the data stream have a temporal relationship, and wherein each of the plurality of resulting models is generated according to a clustering result of a corresponding preceding data partition, and each of the plurality of resulting models comprises one or more representative parameters in different categories; determining a starting model of the current data partition according to the plurality of resulting models, wherein the starting model comprises one or more representative parameters in different categories determined based on representative parameters of the same category in the plurality of resulting models; and clustering data records in the current data partition by using the starting model.
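One way to read this abstract is as k-means seeded from earlier partitions. In the sketch below, a "resulting model" is assumed to be a dictionary from category name to centroid, and the starting model is assumed to be the plain average of same-category centroids; the abstract only says the starting parameters are "determined based on" them, so a weighted or recency-biased combination would fit equally well.

```python
def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def starting_model(previous_models):
    """Average each category's centroid over the preceding partitions."""
    start = {}
    for category in previous_models[0]:
        points = [m[category] for m in previous_models]
        start[category] = [sum(col) / len(points) for col in zip(*points)]
    return start

def cluster_partition(records, model, n_iter=10):
    """Run k-means on the current partition, seeded with the starting model."""
    model = {c: list(p) for c, p in model.items()}
    assignment = []
    for _ in range(n_iter):
        assignment = [min(model, key=lambda c: dist2(r, model[c])) for r in records]
        for c in model:
            members = [r for r, a in zip(records, assignment) if a == c]
            if members:  # keep the old centroid if a category gets no records
                model[c] = [sum(col) / len(members) for col in zip(*members)]
    return model, assignment
```

The model returned for the current partition can then be appended to the list of resulting models and used when the next partition arrives.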
-
Publication No.: US11205129B2
Publication Date: 2021-12-21
Application No.: US16889695
Filing Date: 2020-06-01
Applicant: Advanced New Technologies Co., Ltd.
Inventor: Wenjing Fang, Jun Zhou, Licui Gao
Abstract: Implementations of the present specification disclose methods, devices, and apparatuses for determining a feature interpretation of a predicted label value of a user generated by a GBDT model. In one aspect, the method includes separately obtaining, from each of a predetermined quantity of decision trees ranked among top decision trees, a leaf node and a score of the leaf node; determining a respective prediction path of each leaf node; obtaining, for each parent node on each prediction path, a split feature and a score of the parent node; determining, for each child node on each prediction path, a feature corresponding to the child node and a local increment of the feature on the child node; obtaining a collection of features respectively corresponding to the child nodes; and obtaining a respective measure of relevance between the feature corresponding to the at least one child node and the predicted label value.
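The path-based attribution described here can be sketched as follows. The nested-dictionary tree layout, the rule that a parent's score is the mean of its two children's scores, and the use of summed local increments as the relevance measure are assumptions for this example; the abstract does not fix any of them.

```python
from collections import defaultdict

def node_score(node):
    """Leaf score, or (assumed) the mean of the two children's scores."""
    if "score" in node:
        return node["score"]
    return (node_score(node["left"]) + node_score(node["right"])) / 2.0

def feature_contributions(trees, sample):
    """Sum each feature's local increments along the sample's prediction paths."""
    contrib = defaultdict(float)
    for tree in trees:
        node = tree
        while "score" not in node:  # walk from the root down to the leaf
            go_left = sample[node["feature"]] <= node["threshold"]
            child = node["left"] if go_left else node["right"]
            # The child inherits the parent's split feature; its local
            # increment is the change in score from parent to child.
            contrib[node["feature"]] += node_score(child) - node_score(node)
            node = child
    return dict(contrib)
```

With this scoring rule, the increments along one path sum to the leaf score minus the root score, so the per-feature totals decompose the tree's output for that sample.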
-
Publication No.: US11106804B2
Publication Date: 2021-08-31
Application No.: US16720931
Filing Date: 2019-12-19
Applicant: Advanced New Technologies Co., Ltd.
Inventor: Peilin Zhao, Jun Zhou, Xiaolong Li, Longfei Li
Abstract: Techniques for data sharing between a data miner and a data provider are provided. A set of public parameters is downloaded from the data miner. The public parameters are data miner parameters associated with a feature set of training sample data. A set of private parameters in the data provider can be replaced with the set of public parameters. The private parameters are data provider parameters associated with the feature set of training sample data. The private parameters are updated to provide a set of update results. The private parameters are updated based on a model parameter update algorithm associated with the data provider. The update results are uploaded to the data miner.
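The download, replace, update, and upload cycle can be sketched as below. The `DataMiner` and `DataProvider` classes, the linear model, the single gradient-descent step used as the update algorithm, and the averaging performed by the miner are all assumptions for illustration; the abstract leaves the update algorithm and the miner's aggregation unspecified.

```python
class DataMiner:
    def __init__(self, n_features):
        self.public_params = [0.0] * n_features

    def download(self):
        """Hand out a copy of the current public parameters."""
        return list(self.public_params)

    def upload(self, update_results):
        """Assumed aggregation: average the providers' update results."""
        n = len(update_results)
        self.public_params = [sum(col) / n for col in zip(*update_results)]

class DataProvider:
    def __init__(self, X, y):
        self.X, self.y = X, y  # raw training samples stay with the provider
        self.private_params = None

    def run_round(self, miner, lr=0.1):
        # Replace the private parameters with the downloaded public ones.
        self.private_params = miner.download()
        grad = [0.0] * len(self.private_params)
        for row, target in zip(self.X, self.y):
            err = sum(w * x for w, x in zip(self.private_params, row)) - target
            for j, x in enumerate(row):
                grad[j] += err * x / len(self.y)
        # Assumed update algorithm: one gradient-descent step on local data.
        self.private_params = [w - lr * g for w, g in zip(self.private_params, grad)]
        return self.private_params  # the update results to upload
```

A training loop would call `run_round` on each provider and pass the collected results to `miner.upload`, so that only parameters, never samples, reach the data miner.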
-
Publication No.: US11257007B2
Publication Date: 2022-02-22
Application No.: US16587977
Filing Date: 2019-09-30
Applicant: Advanced New Technologies Co., Ltd.
Inventor: Xinxing Yang, Shaosheng Cao, Jun Zhou, Xiaolong Li
Abstract: An N×M dimensional target matrix is generated based on N data samples and M dimensional data features respectively corresponding to the N data samples. Encryption calculation is performed on the N×M dimensional target matrix based on a Principal Component Analysis (PCA) algorithm to obtain an N×K dimensional encryption matrix, where K is less than M. The N×K dimensional encryption matrix is transmitted to a modeling server. The modeling server trains a machine learning model by using the N×K dimensional encryption matrix as a training sample.
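The N×M to N×K reduction can be sketched with a dependency-free PCA. The function name `pca_encrypt`, the use of power iteration with deflation, and the random initialization are assumptions for this example; a real system would more likely call a numerical library.

```python
import random

def pca_encrypt(X, k, n_iter=200, seed=0):
    """Project the N x M matrix X onto its top-k principal components."""
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(m)]
    centered = [[row[j] - means[j] for j in range(m)] for row in X]
    cov = [[sum(r[a] * r[b] for r in centered) / n for b in range(m)]
           for a in range(m)]
    components = []
    for _ in range(k):
        v = [rng.uniform(-1.0, 1.0) for _ in range(m)]
        norm = sum(x * x for x in v) ** 0.5
        v = [x / norm for x in v]
        for _ in range(n_iter):  # power iteration toward the top eigenvector
            w = [sum(cov[a][b] * v[b] for b in range(m)) for a in range(m)]
            norm = sum(x * x for x in w) ** 0.5
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        lam = sum(v[a] * cov[a][b] * v[b] for a in range(m) for b in range(m))
        components.append(v)
        # Deflate so the next pass converges to the next component.
        cov = [[cov[a][b] - lam * v[a] * v[b] for b in range(m)] for a in range(m)]
    Z = [[sum(r[j] * c[j] for j in range(m)) for c in components] for r in centered]
    return Z, components
```

In this reading, only `Z` (the N×K matrix) would be sent to the modeling server, while the provider keeps `components` and the column means, without which the original features cannot be reconstructed exactly.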
-
Publication No.: US11157818B2
Publication Date: 2021-10-26
Application No.: US17158451
Filing Date: 2021-01-26
Applicant: Advanced New Technologies Co., Ltd.
Inventor: Chaochao Chen, Jun Zhou
Abstract: Disclosed are a model training method and apparatus based on a gradient boosting decision tree (GBDT). A GBDT algorithm flow is divided into two stages. In the first stage, labeled samples are obtained from a data domain of a service scenario similar to a target service scenario to sequentially train several decision trees, and the training residual generated after the training in the first stage is determined; in the second stage, labeled samples are obtained from a data domain of the target service scenario, and several decision trees continue to be trained based on the training residual. Finally, a model applied to the target service scenario is obtained by integrating the decision trees trained in the first stage with the decision trees trained in the second stage.
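The two-stage flow can be sketched with one-dimensional regression stumps. The function names, the squared-error loss, the fixed learning rate, and the interpretation that second-stage trees fit the residuals of target samples under the first-stage ensemble are assumptions for this example.

```python
def fit_stump(xs, residuals):
    """Best single-threshold regression stump for 1-D inputs."""
    best = None
    for t in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:  # all inputs identical: predict the mean residual
        mean = sum(residuals) / len(residuals)
        return (xs[0], mean, mean)
    return best[1:]

def predict(trees, x, lr):
    """Sum the shrunken outputs of all stumps for one input."""
    return sum(lr * (lm if x <= t else rm) for t, lm, rm in trees)

def boost(trees, xs, ys, n_trees, lr):
    """Append n_trees stumps, each fitted to the current residuals."""
    for _ in range(n_trees):
        residuals = [y - predict(trees, x, lr) for x, y in zip(xs, ys)]
        trees.append(fit_stump(xs, residuals))
    return trees

def two_stage_gbdt(source, target, n_source, n_target, lr=0.5):
    """source and target are (xs, ys) pairs from the two data domains."""
    trees = boost([], source[0], source[1], n_source, lr)    # stage 1: similar domain
    return boost(trees, target[0], target[1], n_target, lr)  # stage 2: target domain
```

The returned list holds the trees from both stages, so prediction for the target scenario uses the integrated ensemble.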