RANKED FACTOR SELECTION FOR MACHINE LEARNING MODEL

    Publication number: US20220222483A1

    Publication date: 2022-07-14

    Application number: US17576040

    Filing date: 2022-01-14

    Abstract: Described herein are techniques for systematically reducing the number of factors of an input dataset that impact a target prediction of a trained ML model. The techniques include obtaining a dataset of typed data points and ascertaining the factors of the data points based, at least in part, on the datatypes of the data points. The techniques also include obtaining an indicator of correlation of each factor ascertained in the dataset to a target prediction by a trained ML model and assigning a score to each respective factor based on that indicator of correlation. The techniques further include ranking the ascertained factors by score, selecting factors from among them, and providing the selected factors for making the target prediction by the trained ML model.
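The steps in this abstract (ascertain factors by datatype, score each by its correlation to the target, rank, select) can be illustrated with a minimal sketch. It is not the patented method: the abstract does not say which correlation indicator is used or how factors are chosen from the ranking, so the sketch assumes the absolute Pearson correlation as the score and a simple top-k cut. The function names (`ascertain_factors`, `rank_factors`, `select_factors`), the column names, and the sample rows are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def ascertain_factors(rows, target):
    """Assumption: a factor is any numeric (non-bool) column other than the target."""
    return [
        name
        for name, value in rows[0].items()
        if name != target
        and isinstance(value, (int, float))
        and not isinstance(value, bool)
    ]


def rank_factors(rows, target):
    """Score each factor by |Pearson r| against the target and sort descending."""
    ys = [row[target] for row in rows]
    scores = {
        name: abs(pearson([row[name] for row in rows], ys))
        for name in ascertain_factors(rows, target)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def select_factors(rows, target, top_k):
    """Assumption: selection keeps the top_k highest-scoring factors."""
    return [name for name, _ in rank_factors(rows, target)[:top_k]]


# Hypothetical dataset: "pred" stands in for the target prediction of a trained model.
rows = [
    {"id": "a", "usage": 2.0, "age": 1.0, "noise": 3.0, "pred": 1.0},
    {"id": "b", "usage": 4.0, "age": 3.0, "noise": 1.0, "pred": 2.0},
    {"id": "c", "usage": 6.0, "age": 2.0, "noise": 2.0, "pred": 3.0},
    {"id": "d", "usage": 8.0, "age": 5.0, "noise": 1.0, "pred": 4.0},
    {"id": "e", "usage": 10.0, "age": 4.0, "noise": 3.0, "pred": 5.0},
]

ranking = rank_factors(rows, "pred")
selected = select_factors(rows, "pred", top_k=2)
print(ranking)
print(selected)
```

The string-typed `id` column is excluded at the datatype step, so it never receives a score. A different indicator (for example a rank correlation or a model-derived importance) could replace `pearson` without changing the ranking and selection steps.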

    Method for Automatic Detection of Pair-Wise Interaction Effects Among Large Number of Variables

    Publication number: US20240160696A1

    Publication date: 2024-05-16

    Application number: US18241713

    Filing date: 2023-09-01

    CPC classification number: G06F18/2113 G06F17/18 G06F18/27

    Abstract: Techniques for automatically detecting pair-wise interaction effects among a large number of variables are provided. An example method includes obtaining a data set including data related to a target variable and each of a plurality of variables upon which the target variable depends; grouping the data related to each variable, of the plurality of variables, into a pre-determined number of groups of grouped variable values; analyzing the grouped variable values related to each variable as compared to the grouped variable values related to each other variable, of the plurality of variables, in order to determine a grouped variable interaction score for each pair of variables, of the plurality of variables; and identifying a pre-determined number of pairs of variables having the highest interaction scores, based on the grouped variable interaction score for each pair of variables.
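The grouping and pair-scoring steps in this abstract can be sketched as follows. The abstract does not define how the grouped variable interaction score is computed, so this sketch substitutes a two-way ANOVA-style measure: for each pair of binned variables it takes the mean of the target in every cell and measures how far those cell means depart from a purely additive combination of the two variables' group means. Equal-width binning, the function names, and the synthetic data are likewise assumptions made only for illustration.

```python
from collections import defaultdict
from itertools import combinations, product


def bin_values(values, n_groups):
    """Equal-width binning of a numeric column into n_groups group indices."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = hi - lo
    return [min(int((v - lo) / width * n_groups), n_groups - 1) for v in values]


def group_means(keys, target):
    """Mean of the target for each distinct key."""
    buckets = defaultdict(list)
    for key, y in zip(keys, target):
        buckets[key].append(y)
    return {key: sum(ys) / len(ys) for key, ys in buckets.items()}


def interaction_score(bins_a, bins_b, target):
    """Assumed score: count-weighted mean squared departure from additivity."""
    grand = sum(target) / len(target)
    mean_a = group_means(bins_a, target)
    mean_b = group_means(bins_b, target)
    cell_keys = list(zip(bins_a, bins_b))
    cell_mean = group_means(cell_keys, target)
    total = 0.0
    for a, b in cell_keys:  # one term per row, so cells are weighted by their size
        total += (cell_mean[(a, b)] - mean_a[a] - mean_b[b] + grand) ** 2
    return total / len(target)


def top_interacting_pairs(data, target, n_groups, top_n):
    """Score every pair of variables and return the top_n highest-scoring pairs."""
    binned = {name: bin_values(col, n_groups) for name, col in data.items()}
    scored = [
        ((a, b), interaction_score(binned[a], binned[b], target))
        for a, b in combinations(data, 2)
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_n]


# Synthetic full-factorial data: the target has an x1*x2 interaction and an additive x3 term.
grid = list(product([0.0, 1.0], repeat=3))
data = {
    "x1": [row[0] for row in grid],
    "x2": [row[1] for row in grid],
    "x3": [row[2] for row in grid],
}
target = [4.0 * x1 * x2 + x3 for x1, x2, x3 in grid]

top = top_interacting_pairs(data, target, n_groups=2, top_n=1)
print(top)
```

Because every pair of variables is scored, the cost grows quadratically with the number of variables; binning keeps each pair's score cheap to compute, which is presumably the motivation for the grouping step in the abstract.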
