Distributable feature analysis and tree model training system
Abstract:
A computing system computes a variable relevance using a trained tree model. (A) A next child node is selected. (B) A number of observations associated with the next child node is computed. (C) A population ratio value is computed. (D) A next leaf node is selected. (E) First observations are identified. (F) A first impurity value is computed for the first observations. (G) Second observations are identified when the first observations are associated with the descending child nodes. (H) A second impurity value is computed for the second observations. (I) A gain contribution is computed. (J) A node gain value is updated. (K) (D) through (J) are repeated. (L) A variable gain value is updated for a variable associated with the split test. (M) (A) through (L) are repeated. (N) A set of relevant variables is selected based on the variable gain value.
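As a rough illustration of gain-based variable relevance, the following Python sketch accumulates a population-weighted impurity gain for each split variable and then selects the variables whose total gain meets a threshold. It is a simplified, standard formulation and does not reproduce the exact per-leaf iteration of steps (D) through (K). The `Node` layout, the use of Gini impurity, binary threshold splits, and the names `variable_gains`, `select_relevant`, and `min_gain` are all assumptions made for the example; none of them comes from the abstract.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # Hypothetical node layout; the abstract does not specify one.
    variable: Optional[str] = None  # variable used by the split test (None for a leaf)
    threshold: float = 0.0          # assumed binary split: value <= threshold goes left
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def gini(labels):
    """Gini impurity of a list of class labels (an assumed impurity measure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def variable_gains(root, rows, labels):
    """Accumulate population-weighted impurity gain per split variable."""
    total = len(rows)
    gains = defaultdict(float)
    stack = [(root, list(range(total)))]
    while stack:
        node, idx = stack.pop()
        # Leaves and empty partitions contribute no gain.
        if node is None or node.variable is None or not idx:
            continue
        # Population ratio: share of all observations that reach this node.
        ratio = len(idx) / total
        left_idx = [i for i in idx if rows[i][node.variable] <= node.threshold]
        right_idx = [i for i in idx if rows[i][node.variable] > node.threshold]
        # Impurity before the split, and weighted impurity after it.
        parent_impurity = gini([labels[i] for i in idx])
        child_impurity = sum(
            len(part) / len(idx) * gini([labels[i] for i in part])
            for part in (left_idx, right_idx)
        )
        # Credit the node's gain to the variable used by its split test.
        gains[node.variable] += ratio * (parent_impurity - child_impurity)
        stack.append((node.left, left_idx))
        stack.append((node.right, right_idx))
    return dict(gains)


def select_relevant(gains, min_gain):
    """Return variables whose accumulated gain is at least min_gain, best first."""
    kept = [v for v, g in gains.items() if g >= min_gain]
    return sorted(kept, key=lambda v: -gains[v])
```

Here `rows` is assumed to be a list of dictionaries keyed by variable name and `labels` a parallel list of class labels. A threshold is only one possible selection rule; keeping the top-k variables by gain would fit the same structure.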