-
Publication No.: US20220245448A1
Publication Date: 2022-08-04
Application No.: US17195116
Application Date: 2021-03-08
Applicant: EMC IP Holding Company LLC
Inventor: Jiacheng Ni, Qiang Chen, Zijia Wang, Zhen Jia
Abstract: Embodiments of the present disclosure provide a method, a device, and a computer program product for updating a model. The method includes: determining a performance metric of a trained machine learning model at runtime; determining a homogeneity degree between a verification data set processed by the machine learning model at runtime and a training data set used to train the machine learning model; determining a type of a conceptual drift of the machine learning model based on the performance metric and the homogeneity degree; and performing an update of the machine learning model based on the type of the conceptual drift, where the update includes a partial update or a global update. In this way, a desired performance of the machine learning model can be maintained, while avoiding excessive time costs and computational resource costs caused by frequent global updates.
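The decision flow in this abstract can be sketched as follows. This is only an illustration: the abstract does not say how the performance metric and the homogeneity degree map to a drift type, so the thresholds, the mapping rule, and the names `Update` and `choose_update` are all assumptions made here, not the patented method.

```python
from enum import Enum


class Update(Enum):
    NONE = "none"
    PARTIAL = "partial"
    GLOBAL = "global"


def choose_update(performance, homogeneity,
                  perf_threshold=0.9, homog_threshold=0.5):
    """Pick an update type from a runtime metric and a homogeneity degree.

    Assumed rule (not taken from the source):
      - performance still acceptable             -> no update
      - performance dropped, data still similar  -> partial update
      - performance dropped, data dissimilar     -> global update
    """
    if performance >= perf_threshold:
        return Update.NONE
    if homogeneity >= homog_threshold:
        return Update.PARTIAL
    return Update.GLOBAL
```

Under these assumed thresholds, `choose_update(0.7, 0.8)` selects a partial update and `choose_update(0.7, 0.1)` selects a global one, which reflects the abstract's goal of avoiding a global update whenever a cheaper one may suffice.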
-
Publication No.: US12182516B2
Publication Date: 2024-12-31
Application No.: US17527798
Application Date: 2021-11-16
Applicant: EMC IP Holding Company LLC
Inventor: Zijia Wang, Jiacheng Ni, Zhen Jia
Abstract: Embodiments of the present disclosure relate to a computer-implemented method, a device, and a computer program product. The method includes extracting respective themes of a set of documents with release time within a first period; determining respective semantic information of the themes and frequencies of the themes appearing in the set of documents; and determining the number of documents associated with the themes within a second period according to a prediction model and based on the semantic information and frequencies of the themes. The second period is after the first period. Embodiments of the present disclosure can better predict the tendency of the themes appearing in the future based on the semantic information and frequencies of the themes.
-
Publication No.: US20230064850A1
Publication Date: 2023-03-02
Application No.: US17492853
Application Date: 2021-10-04
Applicant: EMC IP Holding Company LLC
Inventor: Zijia Wang, Jiacheng Ni, Zhen Jia, Wenbin Yang
IPC: G06K9/62
Abstract: Embodiments of the present disclosure provide a method, an electronic device, and a computer program product for analyzing samples. The method includes acquiring a set of feature representations associated with a set of samples. The set of samples illustratively have classification information for indicating classifications of the set of samples. The method further includes adjusting the set of feature representations so that distances between feature representations of samples corresponding to the same classification are less than a first distance threshold. The method further includes training a classification model based on the adjusted set of feature representations and the classification information. The classification model is illustratively configured to receive an input sample and determine a classification of the input sample. In this manner, a relatively accurate classification model can be trained using a small number of samples, thereby reducing computation time and required computation capacity.
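A minimal sketch of the adjust-then-train idea, under stated assumptions: the abstract does not say how the feature representations are adjusted or which classifier is trained, so pulling each feature toward its class centroid and using a nearest-centroid classifier are stand-ins chosen here for illustration. The names `class_centroids`, `tighten`, and `predict` are hypothetical.

```python
import math
from collections import defaultdict


def class_centroids(features, labels):
    """Return the mean feature vector of each class."""
    groups = defaultdict(list)
    for vec, label in zip(features, labels):
        groups[label].append(vec)
    return {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in groups.items()
    }


def tighten(features, labels, threshold, step=0.5, max_iters=100):
    """Move features toward their class centroid until every same-class
    pair is closer than `threshold` (or `max_iters` is reached)."""
    feats = [list(map(float, f)) for f in features]
    for _ in range(max_iters):
        compact = all(
            math.dist(a, b) < threshold
            for a, la in zip(feats, labels)
            for b, lb in zip(feats, labels)
            if la == lb
        )
        if compact:
            break
        centers = class_centroids(feats, labels)
        feats = [
            [x + step * (c - x) for x, c in zip(vec, centers[label])]
            for vec, label in zip(feats, labels)
        ]
    return feats


def predict(sample, centroids):
    """Nearest-centroid classifier, a stand-in for the classification model."""
    return min(centroids, key=lambda label: math.dist(sample, centroids[label]))
```

A typical use would call `tighten` on the few labeled samples, build centroids from the adjusted features with `class_centroids`, and pass those centroids to `predict` for each new input sample.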
-
Publication No.: US20230038047A1
Publication Date: 2023-02-09
Application No.: US17405241
Application Date: 2021-08-18
Applicant: EMC IP Holding Company LLC
Inventor: Zijia Wang, Jiacheng Ni, Zhen Jia, Wenbin Yang
Abstract: Embodiments of the present disclosure relate to a method, a device, and a computer program product for image recognition. In some embodiments, characterization information for a first reference image in a reference image set is generated in an image recognition engine by using a Gaussian mixture model. First reference label information for the first reference image is generated based on the characterization information for the first reference image, the first reference label information being associated with a category of a first object in the first reference image. The image recognition engine is updated by determining the accuracy of the first reference label information for the first reference image. In this way, good characterization of images and generation of reference label information for the images can be achieved, thus both improving the robustness of the generated reference label information and significantly improving the accuracy of image recognition.
-
Publication No.: US20230025148A1
Publication Date: 2023-01-26
Application No.: US17402769
Application Date: 2021-08-16
Applicant: EMC IP Holding Company LLC
Inventor: Jiacheng Ni, Zijia Wang, Jinpeng Liu, Zhen Jia
Abstract: Embodiments of the present disclosure relate to a model optimization method, an electronic device, and a computer program product. This method includes: determining an initial learning rate combination for a deep learning model, wherein the initial learning rate combination includes a plurality of learning rates, each learning rate being determined for one of a plurality of layers of the deep learning model, and the plurality of learning rates including static learning rates and dynamic learning rates; and adjusting the initial learning rate combination to obtain a target learning rate combination, wherein an accuracy rate achieved when the target learning rate combination is used to train the deep learning model is higher than or equal to a first threshold accuracy rate. With the technical solution of the present disclosure, a deep learning model can be optimized by setting learning rates for each layer of the deep learning model.
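The notion of a learning rate combination can be sketched as a mapping from layer name to a rate schedule, where a static rate is constant and a dynamic rate changes with the training step. The abstract does not describe how the combination is adjusted, so the simple candidate scan below, the exponential decay schedule, and the names `static_rate`, `decaying_rate`, and `find_target_combination` are assumptions for illustration only. The `evaluate` callback is a placeholder for training the model and measuring its accuracy.

```python
from typing import Callable, Dict, Iterable, Optional

Schedule = Callable[[int], float]   # maps a training step to a learning rate
Combination = Dict[str, Schedule]   # one schedule per layer


def static_rate(value: float) -> Schedule:
    """A static learning rate: the same value at every step."""
    return lambda step: value


def decaying_rate(initial: float, decay: float) -> Schedule:
    """A dynamic learning rate: exponential decay (an assumed schedule)."""
    return lambda step: initial * (decay ** step)


def find_target_combination(
    candidates: Iterable[Combination],
    evaluate: Callable[[Combination], float],
    threshold: float,
) -> Optional[Combination]:
    """Return the first candidate whose accuracy reaches `threshold`.

    `evaluate` is supplied by the caller and is expected to train the
    model with the given per-layer schedules and return its accuracy.
    """
    for combination in candidates:
        if evaluate(combination) >= threshold:
            return combination
    return None
```

For example, a candidate such as `{"backbone": static_rate(0.001), "head": decaying_rate(0.1, 0.5)}` keeps one layer group at a fixed rate while the other decays, which is one way to mix static and dynamic rates in a single combination.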
-
Publication No.: US20220343154A1
Publication Date: 2022-10-27
Application No.: US17318568
Application Date: 2021-05-12
Applicant: EMC IP Holding Company LLC
Inventor: Zijia Wang, Jiacheng Ni, Qiang Chen, Zhen Jia
Abstract: Embodiments of the present disclosure relate to a method, an electronic device, and a computer program product for data distillation. The method includes: training an input data set by using a machine learning training process to establish a training model of the input data set; extracting multiple weights from the training model of the input data set, wherein the multiple weights contain information indicating the input data set, and the multiple weights are orthogonal to each other; and retraining the training model by using the multiple weights for generating a reconstructed data set. The embodiments of the present disclosure can greatly reduce the data storage cost of a data storage system and maintain the performance of the data storage system.
-
Publication No.: US20220292361A1
Publication Date: 2022-09-15
Application No.: US17230460
Application Date: 2021-04-14
Applicant: EMC IP Holding Company LLC
Inventor: Zijia Wang, Jiacheng Ni, Qiang Chen, Zhen Jia
Abstract: Embodiments of the present disclosure provide a method, an electronic device, and a computer program product for data processing. In a method for data processing, a first electronic device processes data based on a first data processing model to generate an initial result. A data size of the initial result is smaller than a data size of the data. The first electronic device sends the initial result to a second electronic device. The initial result is adjusted at the second electronic device and based on a second data processing model to generate an adjusted result. The second electronic device has more computing resources than the first electronic device, the second data processing model occupies more computing resources than the first data processing model, and an accuracy of the adjusted result is higher than that of the initial result.
-
Publication No.: US20220237045A1
Publication Date: 2022-07-28
Application No.: US17178413
Application Date: 2021-02-18
Applicant: EMC IP Holding Company LLC
Inventor: Zhen Jia, Zijia Wang
Abstract: A method includes: acquiring a set of operations to be performed on multiple computing units in the computing system; determining, based on the set of operations, the state of the multiple computing units, and an allocation model, an allocation action for allocating the set of operations to the multiple computing units and a reward for the allocation action, wherein the allocation model describes an association relationship among a set of operations, the state of multiple computing units, the allocation action for allocating the set of operations to the multiple computing units, and the reward for the allocation action; receiving an adjustment for the reward in response to determining that a match degree between the reward for the allocation action and a performance index of the computing system after the allocation action is performed satisfies a predetermined condition; and generating, based on the adjustment, training data for updating the allocation model.
-
Publication No.: US20220171798A1
Publication Date: 2022-06-02
Application No.: US17146558
Application Date: 2021-01-12
Applicant: EMC IP Holding Company LLC
Inventor: Zijia Wang, Jiacheng Ni, Zhen Jia, Bo Wei, Chun Xi Chen
Abstract: Embodiments of the present disclosure provide a method, an electronic device, and a computer program product for information processing. In an information processing method, based on multiple weights corresponding to multiple words in text, a computing device determines a target object associated with the text among predetermined multiple objects, and also determines, among the multiple words, a set of key words with respect to the determination of the target object. Next, the computing device determines, among the set of key words, a set of target words related to a text topic of the text. Then, the computing device outputs the set of target words and an identifier of the target object in an associated manner. In this way, the credibility of the target object associated with the text that is determined by the information processing method is improved, thereby improving the user experience of the information processing method.
-
Publication No.: US20220138457A1
Publication Date: 2022-05-05
Application No.: US17106551
Application Date: 2020-11-30
Applicant: EMC IP Holding Company LLC
Inventor: Zijia Wang, Qiang Chen, Jiacheng Ni, Zhen Jia
Abstract: Embodiments of the present disclosure provide a method, a device, and a program product for keystroke pattern analysis. The method includes: acquiring keystroke information of a user on an electronic device, wherein the keystroke information indicates a sequence of characters that are typed sequentially and time information related to the typing of corresponding characters in the sequence of characters; encoding corresponding characters in the sequence of characters respectively into vectorized representations to obtain a sequence of vectorized representations, wherein different characters are encoded into different vectorized representations; superimposing the time information related to the typing of corresponding characters in the sequence of characters respectively to corresponding vectorized representations in the sequence of vectorized representations to obtain a sequence of time-based vectorized representations; and verifying a keystroke pattern of the user by extracting keystroke behavior features from the sequence of time-based vectorized representations.
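The encoding and superimposing steps of this abstract can be sketched as follows. The abstract fixes neither the vector encoding nor the form of the time information, so this example assumes one-hot vectors, uses the interval since the previous keystroke as the time information, and superimposes it by elementwise addition. The function name `encode_keystrokes` is hypothetical, and the feature extraction and verification steps are not shown.

```python
def encode_keystrokes(events):
    """Encode (char, timestamp_seconds) events, given in typing order.

    Returns the vocabulary and a sequence of time-based vectors.
    Assumptions: one-hot character vectors, with the inter-key
    interval added to every component of the vector.
    """
    vocab = sorted({ch for ch, _ in events})
    index = {ch: i for i, ch in enumerate(vocab)}

    encoded = []
    prev_time = events[0][1] if events else 0.0
    for ch, timestamp in events:
        # Different characters map to different one-hot vectors.
        one_hot = [0.0] * len(vocab)
        one_hot[index[ch]] = 1.0
        # Superimpose the time information onto the character vector.
        interval = timestamp - prev_time
        encoded.append([v + interval for v in one_hot])
        prev_time = timestamp
    return vocab, encoded
```

In a fuller pipeline the resulting sequence would be passed to a sequence model that extracts keystroke behavior features; that model is outside the scope of this sketch.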