
python - Interpreting the DecisionTreeRegressor score?

Reposted. Author: 行者123. Updated: 2023-11-30 08:59:25

I am trying to evaluate the relevance of features, and I am using DecisionTreeRegressor() to do so.

The relevant part of the code is shown below:

# TODO: Make a copy of the DataFrame, using the 'drop' function to drop the given feature
new_data = data.drop(['Frozen'], axis = 1)

# TODO: Split the data into training and testing sets(0.25) using the given feature as the target
# TODO: Set a random state.

from sklearn.model_selection import train_test_split


X_train, X_test, y_train, y_test = train_test_split(new_data, data['Frozen'], test_size = 0.25, random_state = 1)

# TODO: Create a decision tree regressor and fit it to the training set

from sklearn.tree import DecisionTreeRegressor


regressor = DecisionTreeRegressor(random_state=1)
regressor.fit(X_train, y_train)

# TODO: Report the score of the prediction using the testing set

from sklearn.model_selection import cross_val_score


#score = cross_val_score(regressor, X_test, y_test)
score = regressor.score(X_test, y_test)

print(score)  # works in both Python 2.x and 3.x

When I run the print call, it outputs the following score:

-0.649574327334

You can find the implementation of the score function here, along with the following explanation:

Returns the coefficient of determination R^2 of the prediction. ... The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse).

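I have not yet grasped the whole concept, so this explanation does not help me much. For example, I cannot understand why the score can be negative or what exactly it indicates (if something is squared, I would expect it to be positive only).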


What does this score indicate, and why can it be negative?

If you know of any article on this (aimed at beginners), that would help too!

Accepted answer

By its definition, R^2 can be negative if the model fits the data worse than a horizontal line at the mean of the target ( https://en.wikipedia.org/wiki/Coefficient_of_determination ). Basically:

R^2 = 1 - SS_res/SS_tot

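Here SS_res is the sum of squared residuals of the model's predictions, and SS_tot is the total sum of squares of the observed values around their mean. Both are sums of squares, so neither can be negative, but their ratio can exceed 1: whenever SS_res > SS_tot, R^2 is negative. Despite its name, R^2 in this definition is not the square of any quantity, which is why it is not restricted to positive values. Also take a look at this answer: https://stats.stackexchange.com/questions/12900/when-is-r-squared-negative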
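To make the formula concrete, here is a minimal sketch that computes R^2 by hand on a tiny made-up dataset. The helper name r_squared and the sample values are assumptions chosen for illustration; they are not part of scikit-learn or of the question's data. It compares three sets of predictions: perfect ones, the constant mean (the "horizontal line"), and ones that are worse than the mean.

```python
def r_squared(y_true, y_pred):
    """Compute R^2 = 1 - SS_res / SS_tot (hypothetical helper for illustration)."""
    mean_y = sum(y_true) / len(y_true)
    # Sum of squared residuals between observations and predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares of the observations around their mean.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up target values, used only for this example.
y_true = [1.0, 2.0, 3.0, 4.0]

perfect_pred = list(y_true)          # predicts every value exactly
mean_pred = [2.5] * len(y_true)      # horizontal line at the mean of y_true
bad_pred = [4.0, 3.0, 2.0, 1.0]      # systematically worse than the mean

print(r_squared(y_true, perfect_pred))  # 1.0
print(r_squared(y_true, mean_pred))     # 0.0
print(r_squared(y_true, bad_pred))      # -3.0
```

Read this way, a score of about -0.65 suggests that, on the test set, the tree's predictions of 'Frozen' have a larger squared error than simply predicting the test-set mean of 'Frozen' for every sample.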

Regarding "python - Interpreting the DecisionTreeRegressor score?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/46139186/
