Decision tree training using a database system

    公开(公告)号:US11934931B2

    公开(公告)日:2024-03-19

    申请号:US16222974

    申请日:2018-12-17

    摘要: In an embodiment, a computer-implemented method for training a decision tree using a database system, the decision tree comprising a plurality nodes, comprises, by one or more computing devices: storing in a database input data for training the decision tree, the input data comprising a plurality of feature values corresponding to a plurality of features; generating a particular node of the plurality of decision nodes by: selecting a subset of the plurality of features and a subset of the input data; using one or more queries to the database system, for each feature of the subset of the plurality of features, calculating an information gain associated with the feature based on the subset of the input data; identifying a particular feature of the subset of the plurality of features associated with the highest information gain; associating the particular node with the particular feature, wherein the particular node causes the decision tree to branch based on the particular feature.