
python - sklearn.tree.DecisionTreeClassifier : Get all samples that fell into leaf node

Reposted. Author: 行者123. Updated: 2023-11-28 17:03:02

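For every sample, I want to evaluate the size of the leaf node it falls into.

Based on this excellent answer, I have worked out a way to extract the number of samples in each leaf node: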

from sklearn.tree import _tree, DecisionTreeClassifier
import numpy as np

clf = DecisionTreeClassifier().fit(X_train, y_train)

def tree_get_leaf_size_for_elem(tree, feature_names):
    tree_ = tree.tree_

    def recurse(node):
        if tree_.feature[node] != _tree.TREE_UNDEFINED:
            # internal node: visit both children so every leaf is reached
            recurse(tree_.children_left[node])
            recurse(tree_.children_right[node])
        else:
            # leaf node: sum the per-class values stored for this node
            samples_in_leaf = np.sum(tree_.value[node][0])

    recurse(0)

tree_get_leaf_size_for_elem(clf, feature_names)

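Is there a way to get the indices of all samples (X_train) that end up in a given leaf node? The desired output is a new column of X_train named "leaf_node_size".

Best answer

sklearn lets you do this easily with the apply method: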

from collections import Counter

# get the leaf node id for each training sample
# (clf is the fitted DecisionTreeClassifier from the question)
leaves_index = clf.apply(X_train)

# use Counter to find the number of samples in each leaf
cnt = Counter(leaves_index)

# look up each sample's leaf to get the size of that leaf
elems = [cnt[x] for x in leaves_index]
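
The question also asks for the indices of the samples in each leaf and for a "leaf_node_size" column. Below is a minimal sketch of one way to get both from the output of apply. It assumes X_train is a pandas DataFrame; the iris dataset and the names leaf_ids, samples_per_leaf and result are stand-ins chosen for illustration and do not come from the original post.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Illustrative data only: iris stands in for the asker's X_train / y_train.
iris = load_iris(as_frame=True)
X_train, y_train = iris.data, iris.target

clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)

# Leaf node id for every training row, in row order.
leaf_ids = clf.apply(X_train)

# Row indices of X_train grouped by the leaf they fell into.
samples_per_leaf = {
    int(leaf): X_train.index[leaf_ids == leaf].to_numpy()
    for leaf in np.unique(leaf_ids)
}

# Size of each row's leaf, written to a copy so the fitted features stay untouched.
leaf_sizes = pd.Series(leaf_ids, index=X_train.index)
result = X_train.copy()
result["leaf_node_size"] = leaf_sizes.map(leaf_sizes.value_counts())

print(result.head())
```

Writing the column to a copy rather than to X_train itself keeps the extra column from being passed back into fit or apply later. For training data fitted without sample weights, the group sizes should also agree with clf.tree_.n_node_samples at the corresponding leaf ids.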

Regarding python - sklearn.tree.DecisionTreeClassifier : Get all samples that fell into leaf node, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/53079393/
