
machine-learning - How to minimize the lasso loss function with scipy.minimize?


Main question: why does minimizing with scipy.minimize not shrink the lasso regression coefficients to zero?
I am trying to build a lasso model with scipy.minimize. However, it only works when alpha is zero (that is, it behaves like plain squared error). When alpha is nonzero, it returns a worse result (higher loss) and still none of the coefficients are zero.
I know the lasso is not differentiable, but I tried the Powell optimizer, which is supposed to handle non-differentiable losses (I also tried BFGS, which is supposed to handle non-smooth ones). Neither of these optimizers worked.
To test this, I created a dataset where y is random (provided here for reproducibility), the first feature of X is exactly y*.5, and the other four features are random (also provided here for reproducibility). I expected the algorithm to shrink the random coefficients to zero and keep only the first one, but that did not happen.
For the lasso loss function, I am using the formula from this paper (figure 1, first page).
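For reference, assuming the paper's figure matches the standard form (which is also what the code below implements):

    L(w) = ||y - Xw||_2^2 + alpha * sum_j |w_j|

The alpha-weighted L1 term is the part that is non-differentiable at w_j = 0.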
My code is as follows:

from scipy.optimize import minimize
import numpy as np

class Lasso:

    def _pred(self,X,w):
        return np.dot(X,w)

    def LossLasso(self,weights,X,y,alpha):
        w = weights
        yp = self._pred(X,w)
        loss = np.linalg.norm(y - yp)**2 + alpha * np.sum(abs(w))
        return loss

    def fit(self,X,y,alpha=0.0):
        initw = np.random.rand(X.shape[1]) #initial weights
        res = minimize(self.LossLasso,
                       initw,
                       args=(X,y,alpha),
                       method='Powell')
        return res

if __name__=='__main__':
    y = np.array([1., 0., 1., 0., 0., 1., 1., 0., 0., 0., 1., 0., 0., 0., 1., 0., 1.,
                  1., 1., 0.])
    X_informative = y.reshape(20,1)*.5
    X_noninformative = np.array([[0.94741352, 0.892991  , 0.29387455, 0.30517762],
                                 [0.22743465, 0.66042825, 0.2231239 , 0.16946974],
                                 [0.21918747, 0.94606854, 0.1050368 , 0.13710866],
                                 [0.5236064 , 0.55479259, 0.47711427, 0.59215551],
                                 [0.07061579, 0.80542011, 0.87565747, 0.193524  ],
                                 [0.25345866, 0.78401146, 0.40316495, 0.78759134],
                                 [0.85351906, 0.39682136, 0.74959904, 0.71950502],
                                 [0.383305  , 0.32597392, 0.05472551, 0.16073454],
                                 [0.1151415 , 0.71683239, 0.69560523, 0.89810466],
                                 [0.48769347, 0.58225877, 0.31199272, 0.37562258],
                                 [0.99447288, 0.14605177, 0.61914979, 0.85600544],
                                 [0.78071238, 0.63040498, 0.79964659, 0.97343972],
                                 [0.39570225, 0.15668933, 0.65247826, 0.78343458],
                                 [0.49527699, 0.35968554, 0.6281051 , 0.35479879],
                                 [0.13036737, 0.66529989, 0.38607805, 0.0124732 ],
                                 [0.04186019, 0.13181696, 0.10475994, 0.06046115],
                                 [0.50747742, 0.5022839 , 0.37147486, 0.21679859],
                                 [0.93715221, 0.36066077, 0.72510501, 0.48292022],
                                 [0.47952644, 0.40818585, 0.89012395, 0.20286356],
                                 [0.30201193, 0.07573086, 0.3152038 , 0.49004217]])
    X = np.concatenate([X_informative,X_noninformative],axis=1)

    #alpha zero
    clf = Lasso()
    print(clf.fit(X,y,alpha=0.0))

    #alpha nonzero
    clf = Lasso()
    print(clf.fit(X,y,alpha=0.5))
While the output for alpha 0 is correct:
     fun: 2.1923913945084075e-24
 message: 'Optimization terminated successfully.'
    nfev: 632
     nit: 12
  status: 0
 success: True
       x: array([ 2.00000000e+00, -1.49737205e-13, -5.49916821e-13,  8.87767676e-13,
        1.75335824e-13])
the output for nonzero alpha has a higher loss, and no coefficients are zero as I expected:
     fun: 0.9714385008821652
 message: 'Optimization terminated successfully.'
    nfev: 527
     nit: 6
  status: 0
 success: True
       x: array([ 1.86644474e+00,  1.63986381e-02,  2.99944361e-03,  1.64568796e-12,
       -6.72908469e-09])
Why are the coefficients of the random features not shrunk to zero, and why is the loss so high?

Best Answer

Here is a working alternative:

import numpy as np
from sklearn.linear_model import Lasso, Ridge
from sklearn.model_selection import GridSearchCV

y = np.array([1., 0., 1., 0., 0., 1., 1., 0., 0., 0., 1., 0., 0., 0., 1., 0., 1., 1., 1., 0.])
X_informative = y.reshape(20, 1) * .5

X_noninformative = np.array([[0.94741352, 0.892991  , 0.29387455, 0.30517762],
                             [0.22743465, 0.66042825, 0.2231239 , 0.16946974],
                             [0.21918747, 0.94606854, 0.1050368 , 0.13710866],
                             [0.5236064 , 0.55479259, 0.47711427, 0.59215551],
                             [0.07061579, 0.80542011, 0.87565747, 0.193524  ],
                             [0.25345866, 0.78401146, 0.40316495, 0.78759134],
                             [0.85351906, 0.39682136, 0.74959904, 0.71950502],
                             [0.383305  , 0.32597392, 0.05472551, 0.16073454],
                             [0.1151415 , 0.71683239, 0.69560523, 0.89810466],
                             [0.48769347, 0.58225877, 0.31199272, 0.37562258],
                             [0.99447288, 0.14605177, 0.61914979, 0.85600544],
                             [0.78071238, 0.63040498, 0.79964659, 0.97343972],
                             [0.39570225, 0.15668933, 0.65247826, 0.78343458],
                             [0.49527699, 0.35968554, 0.6281051 , 0.35479879],
                             [0.13036737, 0.66529989, 0.38607805, 0.0124732 ],
                             [0.04186019, 0.13181696, 0.10475994, 0.06046115],
                             [0.50747742, 0.5022839 , 0.37147486, 0.21679859],
                             [0.93715221, 0.36066077, 0.72510501, 0.48292022],
                             [0.47952644, 0.40818585, 0.89012395, 0.20286356],
                             [0.30201193, 0.07573086, 0.3152038 , 0.49004217]])
X = np.concatenate([X_informative,X_noninformative], axis=1)

_lasso = Lasso()
_lasso_parms = {'alpha': [1e-15, 1e-10, 1e-8, 1e-4, 1e-3, 1e-2, 1, 5, 10, 20]}
_lasso_regressor = GridSearchCV(_lasso, _lasso_parms, scoring='neg_mean_squared_error', cv=5)

print('_lasso_regressor.fit(X, y)')
print(_lasso_regressor.fit(X, y))

print("\n=========================================\n")
print('lasso_regressor.best_params_: ')
print(_lasso_regressor.best_params_)
print("\n")
print('lasso_regressor.best_score_: ')
print(_lasso_regressor.best_score_)
print("\n=========================================\n")

_ridge = Ridge()
_ridge_parms = {'alpha': [1e-15, 1e-10, 1e-8, 1e-4, 1e-3, 1e-2, 1, 5, 10, 20]}
_ridge_regressor = GridSearchCV(_ridge, _ridge_parms, scoring='neg_mean_squared_error', cv=5)

print('_ridge_regressor.fit(X, y)')
print(_ridge_regressor.fit(X, y))

print("\n=========================================\n")
print('_ridge_regressor.best_params_: ')
print(_ridge_regressor.best_params_)
print("\n")
print('_ridge_regressor.best_score_: ')
print(_ridge_regressor.best_score_)
print("\n=========================================\n")
And the output (posted as a screenshot in the original answer; it shows the printed best_params_ and best_score_ for both grid searches).
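A note on why the two approaches behave differently: sklearn's Lasso is fit by coordinate descent, whose per-coordinate update applies a soft-thresholding step that sets a weight to exactly 0.0 whenever its correlation with the current residual falls below a threshold. A generic direction-search method like Powell has no such mechanism, so it only wanders into tiny nonzero values. (Also note that sklearn minimizes (1/(2n))*||y - Xw||^2 + alpha*||w||_1, so its alpha is not on the same scale as the alpha in the question's loss.) Here is a minimal coordinate-descent sketch for the question's objective; the names lasso_cd and soft_threshold are illustrative, not sklearn API:

import numpy as np

def soft_threshold(rho, gamma):
    # S(rho, gamma) = sign(rho) * max(|rho| - gamma, 0): the proximal
    # operator of the L1 penalty, and the step that yields exact zeros.
    return np.sign(rho) * max(abs(rho) - gamma, 0.0)

def lasso_cd(X, y, alpha, n_iter=200):
    # Cyclic coordinate descent for  ||y - Xw||^2 + alpha * sum_j |w_j|.
    # With this (un-normalized) objective the per-coordinate threshold
    # works out to alpha / 2.
    n_features = X.shape[1]
    w = np.zeros(n_features)
    for _ in range(n_iter):
        for j in range(n_features):
            # residual with feature j's current contribution added back
            r_j = y - X @ w + X[:, j] * w[j]
            rho = X[:, j] @ r_j
            z = X[:, j] @ X[:, j]
            # exact minimizer of the objective in coordinate j alone
            w[j] = soft_threshold(rho, alpha / 2.0) / z
    return w

Any weight this update zeroes is exactly 0.0, not merely of order 1e-12.

If you want to stay with scipy.minimize, a standard trick is to split w = u - v with u, v >= 0: the L1 penalty becomes the smooth linear term alpha*sum(u + v), and a bound-constrained solver such as L-BFGS-B can then pin variables exactly to the zero bound. A sketch under the same assumptions (lasso_split is again an illustrative name):

from scipy.optimize import minimize
import numpy as np

def lasso_split(X, y, alpha):
    # Write w = u - v with u, v >= 0; the objective
    #   ||y - X(u - v)||^2 + alpha * sum(u + v)
    # is then smooth, and at the optimum at most one of u_j, v_j is
    # nonzero, so u - v recovers the lasso solution.
    p = X.shape[1]

    def loss(uv):
        u, v = uv[:p], uv[p:]
        r = y - X @ (u - v)
        return r @ r + alpha * np.sum(uv)

    res = minimize(loss, np.zeros(2 * p), method='L-BFGS-B',
                   bounds=[(0.0, None)] * (2 * p))
    return res.x[:p] - res.x[p:]

On the question's X and y, lasso_split(X, y, alpha=0.5) should keep the informative first coefficient while the noise coordinates end up sitting exactly on the zero bound.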

Regarding machine-learning - How to minimize the lasso loss function with scipy.minimize?, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/62532926/
