
python - I want to plot the error against the number of trees for gradient boosting

Reposted. Author: 太空宇宙. Updated: 2023-11-03 19:55:31

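I built a GradientBoostingRegressor with 700 trees, and now I want to check whether it is overfitting. To do that I want to plot the error (y-axis) against the number of trees (100, ..., 700 on the x-axis). But I can't find out how to get the predictions for each tree; right now I only have the error for each data point. I have been searching for days and hope someone can help me find the answer. Thanks.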

from sklearn.ensemble import GradientBoostingRegressor

# X_train_v1 and y_train_v1 are the asker's own training data (not shown).
gb_v2 = GradientBoostingRegressor(
    n_estimators=700,
    learning_rate=0.05,
    max_features=None,
    max_depth=5,
    min_samples_leaf=1,
    min_samples_split=2,
    random_state=42,
    min_impurity_decrease=0,
)
gb_v2.fit(X_train_v1, y_train_v1)

Best answer

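You can use staged_predict to get the predictions after each tree is added. Note that it yields the cumulative prediction of the ensemble at each boosting stage, not the output of each tree in isolation. Here is a simple example, based on sklearn's official example, of how to plot the error against the number of trees.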

import numpy as np
import matplotlib.pyplot as plt

from sklearn.ensemble import GradientBoostingRegressor
from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

params = {'n_estimators': 700, 'random_state': 2}

X, y = make_regression(n_samples=1000)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=0)

clf = GradientBoostingRegressor(**params)
clf.fit(X_train, y_train)

# compute the test set deviance after each boosting stage
test_deviance = np.zeros((params['n_estimators'],), dtype=np.float64)

# staged_predict yields the ensemble's prediction after 1, 2, ..., n trees.
# The original called clf.loss_(y_test, y_pred); that attribute was removed
# in scikit-learn 1.3. For the default squared-error loss,
# mean_squared_error is the equivalent measure.
for i, y_pred in enumerate(clf.staged_predict(X_test)):
    test_deviance[i] = mean_squared_error(y_test, y_pred)

step_size = 10  # plot every 10th stage

plt.plot((np.arange(test_deviance.shape[0]) + 1)[::step_size],
         test_deviance[::step_size],
         '-', label='Test Set Deviance')

plt.legend(loc='upper left')
plt.xlabel('Boosting Iterations')
plt.ylabel('Test Set Deviance')

plt.show()

You can change step_size to adjust how much detail the plot shows.
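Since the question is really about spotting overfitting, it may help to track the training error alongside the test error and read both off at the tree counts mentioned in the question (100, 200, ..., 700). The following is only a minimal sketch under stated assumptions: the synthetic make_regression data stands in for the asker's data, the helper name staged_mse is made up for illustration, and mean squared error is assumed to be the error of interest. A test error that bottoms out and then rises while the training error keeps falling is the usual sign of overfitting.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Assumption: synthetic data stands in for the asker's X_train_v1 / y_train_v1.
X, y = make_regression(n_samples=300, n_features=20, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=0)

n_trees = 700
model = GradientBoostingRegressor(
    n_estimators=n_trees, learning_rate=0.05, max_depth=5, random_state=42)
model.fit(X_train, y_train)


def staged_mse(model, X, y):
    """Hypothetical helper: MSE of the ensemble after 1, 2, ..., n trees."""
    return np.array([mean_squared_error(y, y_pred)
                     for y_pred in model.staged_predict(X)])


train_mse = staged_mse(model, X_train, y_train)
test_mse = staged_mse(model, X_test, y_test)

# Report the error at the tree counts from the question: 100, 200, ..., 700.
# Index n - 1 holds the error of the ensemble made of the first n trees.
for n in range(100, n_trees + 1, 100):
    print(f"{n:4d} trees: train MSE = {train_mse[n - 1]:.2f}, "
          f"test MSE = {test_mse[n - 1]:.2f}")

# Number of trees at which the held-out error is lowest.
best_n = int(np.argmin(test_mse)) + 1
print("Lowest test MSE at", best_n, "trees")
```

If best_n comes out well below 700, the later trees are not improving the held-out error, and a smaller n_estimators is probably enough.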

This post is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/59561371/
