python - How are class_weights applied in sklearn logistic regression?

Reposted. Author: 太空狗. Updated: 2023-10-30 01:31:33

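I am interested in how sklearn applies the class weights we supply. The documentation does not state explicitly where and how the class weights are applied. Reading the source code did not help either (it seems that sklearn.svm.liblinear is used for the optimization, and I cannot read its source because it is a .pyd file...)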

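My guess is that the weights act on the cost function: when class weights are specified, the cost contributed by each class is multiplied by that class's weight. For example, if I have 2 observations each from class 0 (weight=0.5) and class 1 (weight=1), then the cost function would be: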

Cost = 0.5*log(...X_0,y_0...) + 1*log(...X_1,y_1...) + penalization

Does anyone know whether this is correct?

Best answer

Check the following lines in the source code:

    le = LabelEncoder()
    if isinstance(class_weight, dict) or multi_class == 'multinomial':
        class_weight_ = compute_class_weight(class_weight, classes, y)
        sample_weight *= class_weight_[le.fit_transform(y)]
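To make the indexing step concrete, here is a minimal plain-Python sketch of what the last line of that excerpt does. The helper name `expand_class_weights` and the toy labels are made up for illustration and are not part of sklearn; the sketch also assumes that classes missing from the dictionary keep a weight of 1.0.

```python
def expand_class_weights(y, class_weight, sample_weight=None):
    """Hypothetical helper (not sklearn API): fold per-class weights
    into per-sample weights, as the excerpt above does."""
    classes = sorted(set(y))
    # Per-class weights aligned with `classes`; unlisted classes keep 1.0
    # (an assumption of this sketch).
    per_class = [float(class_weight.get(c, 1.0)) for c in classes]
    # Integer code of each label, the role LabelEncoder plays in the excerpt.
    code = {c: i for i, c in enumerate(classes)}
    if sample_weight is None:
        sample_weight = [1.0] * len(y)
    # Equivalent of: sample_weight *= class_weight_[le.fit_transform(y)]
    return [w * per_class[code[label]] for w, label in zip(sample_weight, y)]


y = [0, 0, 1, 1]
weights = expand_class_weights(y, {0: 0.5, 1: 1.0})
print(weights)
```

Every observation ends up carrying the weight of its own class, multiplied by whatever sample weight it already had.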

Here is the source code for the compute_class_weight() function:

    ...
    else:
        # user-defined dictionary
        weight = np.ones(classes.shape[0], dtype=np.float64, order='C')
        if not isinstance(class_weight, dict):
            raise ValueError("class_weight must be dict, 'balanced', or None,"
                             " got: %r" % class_weight)
        for c in class_weight:
            i = np.searchsorted(classes, c)
            if i >= len(classes) or classes[i] != c:
                raise ValueError("Class label {} not present.".format(c))
            else:
                weight[i] = class_weight[c]
    ...

In the first snippet, class_weight is folded into sample_weight, which is then used in several internal functions such as _logistic_loss_and_grad, _logistic_loss, etc.:

# Logistic loss is the negative of the log of the logistic function.
out = -np.sum(sample_weight * log_logistic(yz)) + .5 * alpha * np.dot(w, w)
# NOTE: ----> ^^^^^^^^^^^^^
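So the guess in the question is essentially right: each sample's log-loss term is scaled by the weight of its class. The sketch below re-implements that loss line in plain Python to check this on toy data. It is an illustration under assumptions, not sklearn's code: there is no intercept, labels are encoded as -1/+1, and the names `weighted_logistic_loss`, `X`, `y` and `w` are invented for the example.

```python
import math


def log_logistic(t):
    """Numerically stable log(1 / (1 + exp(-t)))."""
    if t > 0:
        return -math.log1p(math.exp(-t))
    return t - math.log1p(math.exp(t))


def weighted_logistic_loss(w, X, y, sample_weight, alpha):
    """Hypothetical re-implementation of the loss line above, without an
    intercept; labels in y must be -1 or +1."""
    data_term = 0.0
    for x_i, y_i, s_i in zip(X, y, sample_weight):
        yz = y_i * sum(w_j * x_j for w_j, x_j in zip(w, x_i))
        data_term -= s_i * log_logistic(yz)
    penalty = 0.5 * alpha * sum(w_j * w_j for w_j in w)
    return data_term + penalty


# Toy data: two samples of class 0 (encoded as -1) and two of class 1 (+1).
X = [[1.0, 2.0], [0.5, -1.0], [-1.5, 0.3], [2.0, 1.0]]
y = [-1, -1, 1, 1]
w = [0.2, -0.4]

# class_weight = {0: 0.5, 1: 1.0} expanded to one weight per sample.
sample_weight = [0.5, 0.5, 1.0, 1.0]

total = weighted_logistic_loss(w, X, y, sample_weight, alpha=0.0)
class0 = weighted_logistic_loss(w, X[:2], y[:2], [1.0, 1.0], alpha=0.0)
class1 = weighted_logistic_loss(w, X[2:], y[2:], [1.0, 1.0], alpha=0.0)

# The weighted loss is 0.5 * (class-0 loss) + 1.0 * (class-1 loss).
print(math.isclose(total, 0.5 * class0 + 1.0 * class1))
```

The penalty term is switched off here (alpha=0.0) so that only the data term is compared; the class weights do not touch the penalty.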

Regarding "python - How are class_weights applied in sklearn logistic regression?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/50433130/