gpt4 book ai didi

python-3.x - 如何在 XGBooost 中使用 early_stopping_rounds 参数

转载 作者:行者123 更新时间:2023-12-04 03:02:35 24 4
gpt4 key购买 nike

我正在 Datacamp Extreme Gradient Boosting with XGBoost 上进行教程,但我对一个结果有些困惑。

执行以下代码时

# Create your housing DMatrix: 
housing_dmatrix = xgb.DMatrix(data=data, label=y)

# Create the parameter dictionary for each tree: params
params = {"objective":"reg:linear", "max_depth":4}

# Perform cross-validation with early stopping: cv_results
cv_results = xgb.cv(dtrain=housing_dmatrix,params=params,nfold=3, num_boost_round=50, early_stopping_rounds=10, metrics="rmse", as_pandas=True, seed=123)

# Print cv_results
print(cv_results)

mean_mae = cv_results['test-rmse-mean'].min()
boost_rounds = cv_results['test-rmse-mean'].idxmin()
print("\tRMSE {} for {} rounds".format(mean_mae, boost_rounds))

我得到这个输出:
    test-rmse-mean  test-rmse-std  train-rmse-mean  train-rmse-std
0 142644.104167 705.732300 141861.109375 396.179855
1 104867.638021 109.049658 103035.130209 47.104957
2 79261.453125 573.946485 75944.486979 264.292182
3 61684.972656 1010.777141 57339.683594 525.791040
4 50591.589844 1796.898432 44487.089844 597.177883
5 43296.019531 1826.395300 35787.472656 722.298168
6 38898.184896 1751.704546 29919.669922 691.136345
7 36395.877604 1678.547627 26036.014974 643.247157
8 34805.308594 1738.570391 23456.447266 735.406007
9 33709.807292 1476.484849 21628.860677 744.936771
10 33272.565104 1321.658776 20267.642578 720.847179
11 32559.872396 1067.106826 19335.697917 583.088133
12 32151.852865 988.789741 18699.406901 577.292625
13 32147.481120 727.223369 18167.764974 552.016786
14 31866.852865 816.375161 17630.004557 540.065450
15 31786.338542 719.051429 17277.742187 521.709410
16 31541.460287 747.203840 16895.608724 416.004588
17 31397.618490 754.207664 16581.137370 410.450468
18 31445.539714 822.826225 16352.770508 469.756961
19 31452.218099 876.353819 16117.928060 450.584992
20 31385.645182 829.827774 15862.508464 384.789582
21 31296.314453 881.374220 15542.022461 410.365996
22 31290.126953 875.634123 15261.000325 382.412151
23 31252.852865 782.195943 15008.689128 490.904289
24 31210.020182 753.773505 14853.276042 531.599930
RMSE 31210.020182333334 for 24 rounds

通过将 num_boost_round 从 10 更改为 20 ,我得到了这个新输出
    test-rmse-mean  test-rmse-std  train-rmse-mean  train-rmse-std
0 142644.104167 705.732300 141861.109375 396.179855
1 104867.638021 109.049658 103035.130209 47.104957
2 79261.453125 573.946485 75944.486979 264.292182
3 61684.972656 1010.777141 57339.683594 525.791040
4 50591.589844 1796.898432 44487.089844 597.177883
5 43296.019531 1826.395300 35787.472656 722.298168
6 38898.184896 1751.704546 29919.669922 691.136345
7 36395.877604 1678.547627 26036.014974 643.247157
8 34805.308594 1738.570391 23456.447266 735.406007
9 33709.807292 1476.484849 21628.860677 744.936771
10 33272.565104 1321.658776 20267.642578 720.847179
11 32559.872396 1067.106826 19335.697917 583.088133
12 32151.852865 988.789741 18699.406901 577.292625
13 32147.481120 727.223369 18167.764974 552.016786
14 31866.852865 816.375161 17630.004557 540.065450
15 31786.338542 719.051429 17277.742187 521.709410
16 31541.460287 747.203840 16895.608724 416.004588
17 31397.618490 754.207664 16581.137370 410.450468
18 31445.539714 822.826225 16352.770508 469.756961
19 31452.218099 876.353819 16117.928060 450.584992
20 31385.645182 829.827774 15862.508464 384.789582
21 31296.314453 881.374220 15542.022461 410.365996
22 31290.126953 875.634123 15261.000325 382.412151
23 31252.852865 782.195943 15008.689128 490.904289
24 31210.020182 753.773505 14853.276042 531.599930
25 31351.985677 675.589696 14622.553711 484.633075
26 31355.926432 698.751442 14430.575195 619.787710
27 31389.998698 648.971218 14204.662760 568.655261
28 31356.138021 711.192347 14013.201498 546.941784
29 31292.576172 697.337663 13823.481445 581.756303
30 31306.197266 751.915534 13579.544271 599.841587
31 31285.968099 724.867998 13361.774739 631.805694
32 31299.003255 684.342196 13251.625325 674.017166
33 31287.397786 720.993414 13034.457031 749.805941
34 31249.546224 705.225396 12844.346354 691.202512
35 31203.748698 733.094000 12618.286133 624.664883
36 31173.789063 733.643603 12495.182292 609.643329
37 31172.117839 669.111951 12387.871745 607.720998
38 31170.850912 663.908881 12240.325846 583.204898
39 31162.634765 687.274839 12082.383463 555.608639
40 31142.460287 712.869918 11930.459310 589.554611
41 31160.911459 709.512382 11804.457357 614.642441
42 31175.446615 652.927064 11729.721680 618.773653
43 31166.930339 642.093634 11624.710612 623.662847
44 31135.966797 650.693105 11415.508138 612.949992
45 31155.915365 677.580853 11300.589193 651.227389
46 31158.777344 622.338647 11231.110026 663.325455
47 31161.392578 646.070369 11111.234049 679.541984
48 31153.126302 623.746479 11038.423828 677.875949
49 31137.419922 628.337573 10952.264323 653.939810
RMSE 31135.966797 for 44 rounds
  • 我的理解是,对于第二个输出,执行所有 50 个步骤,然后检索最小值。这样对吗?
  • 此外,我的理解是第一个输出与第二个输出相同,只是它在 24 轮后停止。这样对吗?

  • 谢谢大家。

    最佳答案

  • 你是对的。因为你设置了rmse作为您的指标,但未设置 maximize = TRUE ,XGBoost 将返回分配的轮次中具有最低 RMSE 的轮次。
  • 这也是正确的。如果您设置 early_stopping_rounds = n , XGBoost 将在到达 num_boost_round 之前停止如果它走了n在没有改进度量的情况下进行轮次。

  • 请考虑包含一个示例数据集,以便此示例可重现,从而对 future 的读者更有用。

    关于python-3.x - 如何在 XGBooost 中使用 early_stopping_rounds 参数,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/47854217/

    24 4 0
    Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
    广告合作:1813099741@qq.com 6ren.com