-
31.
Publication No.: US11620568B2
Publication Date: 2023-04-04
Application No.: US16388830
Filing Date: 2019-04-18
Applicant: Oracle International Corporation
Inventor: Hesam Fathi Moghadam , Sandeep Agrawal , Venkatanathan Varadarajan , Anatoly Yakovlev , Sam Idicula , Nipun Agarwal
Abstract: Techniques are provided for selecting machine learning algorithms based on performance predictions made with hyperparameter predictors. In an embodiment, for each mini-machine learning model (MML model), a respective hyperparameter predictor set that predicts a respective set of hyperparameter settings for a data set is trained. Each MML model represents a respective reference machine learning model (RML model). Data set samples are generated from the data set. Meta-feature sets are generated, each describing a respective data set sample. A respective target set of hyperparameter settings is generated for each MML model using a hypertuning algorithm. The meta-feature sets and the respective target set of hyperparameter settings are used to train the respective hyperparameter predictor set. Each hyperparameter predictor set is used during training and inference to improve the accuracy of automatically selecting an RML model per data set.
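A minimal sketch of the idea in Python, using scikit-learn stand-ins (the meta-feature set, search space, and choice of predictor are assumptions for illustration, not the patented design): a hypertuning search produces target hyperparameter settings for data set samples, and a predictor learns to map sample meta-features to those settings.

```python
# Sketch: train a hyperparameter predictor from (meta-features -> tuned settings) pairs.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV

def meta_features(X, y):
    """Describe a data set sample with simple meta-features (assumed set)."""
    return [X.shape[0], X.shape[1], len(np.unique(y)), float(X.std())]

samples, targets = [], []
for seed in range(5):                                    # data set samples
    X, y = make_classification(n_samples=300, n_features=10, random_state=seed)
    search = RandomizedSearchCV(                         # hypertuning algorithm (stand-in)
        RandomForestClassifier(random_state=0),
        {"n_estimators": list(range(10, 200, 10)),
         "max_depth": list(range(2, 16))},
        n_iter=8, cv=3, random_state=seed)
    search.fit(X, y)
    samples.append(meta_features(X, y))                  # meta-feature set per sample
    targets.append([search.best_params_["n_estimators"],
                    search.best_params_["max_depth"]])   # target hyperparameter settings

# The hyperparameter predictor set, collapsed here into one multi-output regressor.
predictor = RandomForestRegressor(random_state=0).fit(samples, targets)
X_new, y_new = make_classification(n_samples=500, n_features=10, random_state=99)
print(predictor.predict([meta_features(X_new, y_new)]))  # predicted settings for a new data set
```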
-
32.
Publication No.: US11429895B2
Publication Date: 2022-08-30
Application No.: US16384588
Filing Date: 2019-04-15
Applicant: Oracle International Corporation
Inventor: Anatoly Yakovlev , Venkatanathan Varadarajan , Sandeep Agrawal , Hesam Fathi Moghadam , Sam Idicula , Nipun Agarwal
IPC: G06N20/00
Abstract: Herein are techniques for exploring hyperparameters of a machine learning model (MLM) and for training a regressor to predict the time needed to train the MLM based on a hyperparameter configuration and a dataset. In an embodiment that is deployed in production inferencing mode, for each landmark configuration, each of which contains values for hyperparameters of an MLM, a computer configures the MLM based on the landmark configuration and measures the time spent training the MLM on a dataset. An already-trained regressor predicts the time needed to train the MLM based on a proposed configuration of the MLM, dataset meta-feature values, and the training durations and hyperparameter values of the landmark configurations of the MLM. When instead in training mode, a regressor in training ingests a training corpus of MLM performance history to learn, by reinforcement, to predict a training time for the MLM for new datasets and/or new hyperparameter configurations.
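A minimal sketch, assuming scikit-learn stand-ins (the SVC model, landmark values, meta-features, and a plain supervised regressor are illustrative choices, not the patent's design): measure training time at a few landmark configurations, combine those durations with dataset meta-features and a proposed configuration, and fit a regressor that predicts training time.

```python
# Sketch: learn to predict MLM training time from meta-features + landmark timings.
import time
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.svm import SVC

LANDMARKS = [{"C": 0.1}, {"C": 1.0}, {"C": 10.0}]      # landmark configurations (assumed)

def timed_fit(config, X, y):
    """Configure the MLM (an SVC stand-in) and measure wall-clock training time."""
    start = time.perf_counter()
    SVC(**config).fit(X, y)
    return time.perf_counter() - start

rows, durations = [], []
for seed in range(6):                                   # training corpus of MLM history
    X, y = make_classification(n_samples=400 + 100 * seed, n_features=12,
                               random_state=seed)
    meta = [X.shape[0], X.shape[1]]                     # dataset meta-feature values
    landmark_times = [timed_fit(cfg, X, y) for cfg in LANDMARKS]
    proposed = {"C": float(2 ** seed)}                  # proposed configuration
    rows.append(meta + landmark_times + [proposed["C"]])
    durations.append(timed_fit(proposed, X, y))         # observed training duration

regressor = GradientBoostingRegressor(random_state=0).fit(rows, durations)
# Inferencing mode: predict training time for a proposed configuration on a dataset.
print(regressor.predict([rows[-1]]))
```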
-
33.
Publication No.: US20220138504A1
Publication Date: 2022-05-05
Application No.: US17083536
Filing Date: 2020-10-29
Applicant: Oracle International Corporation
Inventor: Hesam Fathi Moghadam , Anatoly Yakovlev , Sandeep Agrawal , Venkatanathan Varadarajan , Robert Hopkins , Matteo Casserini , Milos Vasic , Sanjay Jinturkar , Nipun Agarwal
Abstract: In a computer-based embodiment, an ML model is trained to detect outliers. The ML model calculates anomaly scores that include a respective anomaly score for each item in a validation dataset. The anomaly scores are automatically organized by sorting and/or clustering. Based on the organized anomaly scores, a separation is measured that indicates the fitness of the ML model. In an embodiment, a computer performs two-clustering of anomaly scores into a first organization that consists of a first normal cluster of anomaly scores and a first anomaly cluster of anomaly scores. The computer performs three-clustering of the same anomaly scores into a second organization that consists of a second normal cluster of anomaly scores, a second anomaly cluster of anomaly scores, and a middle cluster of anomaly scores. A distribution difference between the first organization and the second organization is measured. An ML model is processed based on the distribution difference.
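A minimal sketch of one possible reading (IsolationForest, KMeans, and the share-based difference metric are assumptions for illustration, not the patent's exact method): cluster the anomaly scores with k=2 and k=3 and compare the two organizations to gauge separation.

```python
# Sketch: two-clustering vs. three-clustering of anomaly scores as a fitness signal.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
validation = np.vstack([rng.normal(0, 1, (480, 4)),        # normal items
                        rng.normal(6, 1, (20, 4))])        # outliers

model = IsolationForest(random_state=0).fit(validation)    # ML model trained to detect outliers
scores = -model.score_samples(validation).reshape(-1, 1)   # anomaly score per item

two = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(scores)    # first organization
three = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(scores)  # second organization

def cluster_shares(labels):
    """Fraction of items per cluster, sorted so the organizations are comparable."""
    return np.sort(np.bincount(labels) / labels.size)[::-1]

# Distribution difference between the organizations: a small value suggests the
# middle cluster is nearly empty, i.e. the model separates outliers cleanly.
diff = abs(cluster_shares(two)[:2] - cluster_shares(three)[:2]).sum()
print(f"distribution difference: {diff:.3f}")
```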
-
34.
Publication No.: US20210390466A1
Publication Date: 2021-12-16
Application No.: US17086204
Filing Date: 2020-10-30
Applicant: Oracle International Corporation
Inventor: Venkatanathan Varadarajan , Sandeep R. Agrawal , Hesam Fathi Moghadam , Anatoly Yakovlev , Ali Moharrer , Jingxiao Cai , Sanjay Jinturkar , Nipun Agarwal , Sam Idicula , Nikan Chavoshi
Abstract: A proxy-based automatic non-iterative machine learning (PANI-ML) pipeline is described, which predicts machine learning model configuration performance and outputs an automatically-configured machine learning model for a target training dataset. Techniques described herein use one or more proxy models—which implement a variety of machine learning algorithms and are pre-configured with tuned hyperparameters—to estimate relative performance of machine learning model configuration parameters at various stages of the PANI-ML pipeline. The PANI-ML pipeline implements a radically new approach of rapidly narrowing the search space for machine learning model configuration parameters by performing algorithm selection followed by algorithm-specific adaptive data reduction (i.e., row- and/or feature-wise dataset sampling), and then hyperparameter tuning. Furthermore, because of the one-pass nature of the PANI-ML pipeline and because each stage of the pipeline has convergence criteria by design, the whole PANI-ML pipeline has a novel convergence property that stops the configuration search after one pass.
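A minimal sketch of the one-pass flow, assuming scikit-learn stand-ins for the proxy models, reduction ratios, and tuning grid (none of these particulars come from the patent): algorithm selection on a small proxy sample, then row-wise data reduction for the winning algorithm, then hyperparameter tuning, all in a single pass.

```python
# Sketch: one-pass pipeline of algorithm selection -> data reduction -> tuning.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# Stage 1: algorithm selection with cheap, pre-configured proxy models.
proxies = {"rf": RandomForestClassifier(n_estimators=50, random_state=0),
           "lr": LogisticRegression(max_iter=1000)}
X_proxy, _, y_proxy, _ = train_test_split(X, y, train_size=0.2, random_state=0)

def proxy_score(name):
    return cross_val_score(proxies[name], X_proxy, y_proxy, cv=3).mean()

best_name = max(proxies, key=proxy_score)

# Stage 2: adaptive data reduction (row-wise sampling for the selected algorithm).
X_red, _, y_red, _ = train_test_split(X, y, train_size=0.5, random_state=0)

# Stage 3: hyperparameter tuning on the reduced dataset; the search then stops.
grids = {"rf": {"max_depth": [4, 8, None]}, "lr": {"C": [0.1, 1.0, 10.0]}}
search = GridSearchCV(proxies[best_name], grids[best_name], cv=3).fit(X_red, y_red)
print(best_name, search.best_params_, search.best_score_)
```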
-
35.
Publication No.: US10353670B2
Publication Date: 2019-07-16
Application No.: US15661900
Filing Date: 2017-07-27
Applicant: Oracle International Corporation
Inventor: Jeffrey S. Brooks , Christopher H. Olson , Hesam Fathi Moghadam , Josephus C. Ebergen
Abstract: Embodiments of a processor are disclosed for performing arithmetic operations on a machine-independent number format. The processor may include a floating-point unit and a number unit. The number format may include a sign/exponent block, a length block, and multiple mantissa digits. The number unit may be configured to perform an operation on two operands by converting the digit format of each mantissa digit of each operand, performing the operation using the converted mantissa digits, and then converting each mantissa digit of the result back into the original digit format.
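A minimal software sketch of the digit-format round trip (the excess-1, base-100 digit encoding is an assumption for illustration; the patent describes a hardware number unit, not Python code): decode each stored mantissa digit, perform the arithmetic with carry propagation, then re-encode the result's digits into the original format.

```python
# Sketch: convert mantissa digits, operate, convert back.
OFFSET = 1  # assumed excess-1 storage of positive base-100 mantissa digits

def decode_digits(stored):
    """Stored digit format -> plain base-100 digit values."""
    return [d - OFFSET for d in stored]

def encode_digits(plain):
    """Plain base-100 digit values -> stored digit format."""
    return [d + OFFSET for d in plain]

def add_mantissas(a_stored, b_stored):
    """Add two equal-exponent mantissas digit by digit with carry propagation."""
    a, b = decode_digits(a_stored), decode_digits(b_stored)
    width = max(len(a), len(b))
    a = [0] * (width - len(a)) + a
    b = [0] * (width - len(b)) + b
    result, carry = [], 0
    for da, db in zip(reversed(a), reversed(b)):
        carry, digit = divmod(da + db + carry, 100)
        result.append(digit)
    if carry:
        result.append(carry)
    return encode_digits(list(reversed(result)))

# Stored [2, 4] decodes to digits [1, 3] (value 103); [3, 6] decodes to [2, 5] (205).
# Their sum is 308, i.e. digits [3, 8], re-encoded as [4, 9].
print(add_mantissas([2, 4], [3, 6]))
```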
-