
python - Duplicated features and thresholds from an sklearn random forest when inspecting the decision path

Reposted. Author: 行者123. Updated: 2023-12-05 04:19:23

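When inspecting the decision trees of a random forest model, I get a duplicated feature and threshold (CO2). The code used to visualize the tree is: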

estimator = model.estimators_[10]
from sklearn.tree import export_graphviz
# Export as dot file
export_graphviz(estimator, out_file='tree.dot',
                feature_names=['pdo', 'pna', 'lat', 'lon', 'ele', 'co2'],
                class_names='disWY',
                rounded=False, proportion=False,
                precision=3, filled=True)

# Convert to png using system command (requires Graphviz)
from subprocess import call
call(['dot', '-Tpng', 'tree.dot', '-o', 'tree.png', '-Gdpi=300'])

# Display in jupyter notebook
from IPython.display import Image
Image(filename = 'tree.png')

Clearly, CO2 and -0.69 are used twice. I don't understand how this is possible. Does anyone have any ideas?

[Image: screenshot of the decision tree]

Shouldn't the same feature be split at different thresholds?

Best answer

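This is most likely a rounding artifact: the two thresholds are different, but they differ only beyond the three decimal places that precision=3 displays.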
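As a minimal sketch of that effect, the snippet below rounds two different thresholds to three decimal places. The two values are hypothetical, chosen only for illustration, and Python's built-in `round` stands in here for whatever formatting the exporter applies when `precision=3` is set.

```python
# Two hypothetical split thresholds that differ in the fourth decimal place.
threshold_a = -0.69015
threshold_b = -0.69035

# Rounding to three decimals (an assumed stand-in for precision=3).
shown_a = round(threshold_a, 3)
shown_b = round(threshold_b, 3)

print("actual values:", threshold_a, threshold_b)
print("shown values: ", shown_a, shown_b)
print("actually equal?", threshold_a == threshold_b)
print("look equal?    ", shown_a == shown_b)
```

Both thresholds collapse to the same displayed number even though the underlying splits are distinct, which is how a tree can appear to test the same condition twice.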

The following is a bit contrived, but it is a minimal way to reproduce the effect with RandomForestRegressor:

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import export_graphviz

X = np.array([[-0.6901, 4.123],
[-0.6902, 5.456],
[-0.6903, 6.789],
[-0.6904, 7.012]])
y = np.array([0.0, 1.0, 1.0, 0.0])

reg = RandomForestRegressor(random_state=42).fit(X, y)

export_graphviz(reg.estimators_[6], out_file="tree6.dot", precision=3, filled=True)
# dot -Tpng tree6.dot -o tree6.png

[Image: decision tree showing that feature 0 is split on twice, each time on a value less than or equal to -0.69]

If we pass a higher precision=8 when calling export_graphviz(), we see something like this:

[Image: the same tree with higher floating-point precision. At this precision the two thresholds are visibly different, so the apparent duplicate was caused by the limited precision used when exporting the graphviz representation.]
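The thresholds can also be checked without rendering an image by reading them from the fitted tree. The sketch below reuses the toy data from the reproduction above. The helper `split_nodes` is a hypothetical name introduced here, not part of scikit-learn; it relies on the `tree_` attributes `node_count`, `children_left`, `feature`, and `threshold`. Which splits a given estimator contains may vary with the scikit-learn version, so no particular output is assumed.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import export_graphviz

# Same toy data as in the reproduction above.
X = np.array([[-0.6901, 4.123],
              [-0.6902, 5.456],
              [-0.6903, 6.789],
              [-0.6904, 7.012]])
y = np.array([0.0, 1.0, 1.0, 0.0])

reg = RandomForestRegressor(random_state=42).fit(X, y)


def split_nodes(estimator):
    """Return (node_id, feature_index, threshold) for every internal node.

    Hypothetical helper for illustration; not a scikit-learn function.
    """
    tree = estimator.tree_
    rows = []
    for node_id in range(tree.node_count):
        # Leaf nodes have no children; scikit-learn marks them with -1.
        if tree.children_left[node_id] == -1:
            continue
        rows.append((node_id,
                     int(tree.feature[node_id]),
                     float(tree.threshold[node_id])))
    return rows


estimator = reg.estimators_[6]

# Print the unrounded threshold next to its three-decimal rounding.
for node_id, feature, threshold in split_nodes(estimator):
    print(f"node {node_id}: X[{feature}] <= {threshold!r} "
          f"(rounds to {round(threshold, 3)} at 3 decimals)")

# With out_file=None, export_graphviz returns the DOT source as a string,
# so the higher-precision labels can be read without calling dot.
dot_precise = export_graphviz(estimator, out_file=None, precision=8, filled=True)
print(dot_precise)
```

If two internal nodes list the same feature with thresholds that only differ after the third decimal, the duplicate in the rendered image is a display effect rather than a redundant split.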

This post is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/74802875/
