
java - Correct implementation of hinge loss minimization with gradient descent

Reposted. Author: 行者123. Updated: 2023-11-30 09:12:38

I copied the hinge loss function from here (along with the LossC and LossFunc it is based on). I then plugged it into my gradient descent algorithm as follows:

do
{
    iteration++;
    error = 0.0;
    cost = 0.0;

    // loop through all instances (complete one epoch)
    for (p = 0; p < number_of_files__train; p++)
    {
        // 1. Calculate the hypothesis h = X * theta
        hypothesis = calculateHypothesis( theta, feature_matrix__train, p, globo_dict_size );

        // 2. Calculate the loss = h - y and maybe the squared cost (loss^2)/2m
        //cost = hypothesis - outputs__train[p];
        cost = HingeLoss.loss(hypothesis, outputs__train[p]);
        System.out.println( "cost " + cost );

        // 3. Calculate the gradient = X' * loss / m
        gradient = calculateGradent( theta, feature_matrix__train, p, globo_dict_size, cost, number_of_files__train);

        // 4. Update the parameters theta = theta - alpha * gradient
        for (int i = 0; i < globo_dict_size; i++)
        {
            theta[i] = theta[i] - LEARNING_RATE * gradient[i];
        }
    }

    // summation of squared error (error value for all instances)
    error += (cost*cost);

    /* Root Mean Squared Error */
    System.out.println("Iteration " + iteration + " : RMSE = " + Math.sqrt( error/number_of_files__train ) );

}
while( error != 0 );
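One structural point worth noting in the loop above, independent of the loss function: `error += (cost*cost)` sits outside the per-instance `for` loop, so the reported RMSE reflects only the last instance of the epoch. A minimal sketch of per-instance accumulation, using illustrative names rather than the exact variables above:

```java
public class EpochRmse {
    // Accumulates squared per-instance losses inside the loop and
    // returns the epoch RMSE. (In the question's code the accumulation
    // happens after the loop, so only the last cost is counted.)
    static double rmse(double[] losses) {
        double error = 0.0;
        for (double cost : losses) {
            error += cost * cost; // inside the per-instance loop
        }
        return Math.sqrt(error / losses.length);
    }

    public static void main(String[] args) {
        // sqrt((9 + 16) / 2) = sqrt(12.5)
        System.out.println(rmse(new double[]{3.0, 4.0}));
    }
}
```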

But this doesn't work at all. Is that because of the loss function, or perhaps because of how I wired the loss function into my code?

I suspect my gradient descent implementation itself might also be wrong.

Here is how I calculate the gradient and the hypothesis. Is this right?

static double calculateHypothesis( double[] theta, double[][] feature_matrix, int file_index, int globo_dict_size )
{
    double hypothesis = 0.0;

    for (int i = 0; i < globo_dict_size; i++)
    {
        hypothesis += ( theta[i] * feature_matrix[file_index][i] );
    }
    // bias
    hypothesis += theta[ globo_dict_size ];

    return hypothesis;
}

static double[] calculateGradent( double theta[], double[][] feature_matrix, int file_index, int globo_dict_size, double cost, int number_of_files__train)
{
    double m = number_of_files__train;

    double[] gradient = new double[ globo_dict_size ]; // one for bias?

    for (int i = 0; i < gradient.length; i++)
    {
        gradient[i] = (1.0/m) * cost * feature_matrix[ file_index ][ i ];
    }

    return gradient;
}

The rest of the code is here, if you are interested in taking a look.

Below is what those loss functions look like. Should I use loss or deriv, and are they correct?

/**
 * Computes the HingeLoss loss
 *
 * @param pred the predicted value
 * @param y the target value
 * @return the HingeLoss loss
 */
public static double loss(double pred, double y)
{
    return Math.max(0, 1 - y * pred);
}

/**
 * Computes the first derivative of the HingeLoss loss
 *
 * @param pred the predicted value
 * @param y the target value
 * @return the first derivative of the HingeLoss loss
 */
public static double deriv(double pred, double y)
{
    if (pred * y > 1)
        return 0;
    else
        return -y;
}

Best Answer

The gradient code you provided does not look like the gradient of the hinge loss. For a valid formulation see, for example: https://stats.stackexchange.com/questions/4608/gradient-of-hinge-loss
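Concretely, for hinge loss max(0, 1 - y·pred) with pred = θᵀx, the per-instance gradient is 0 when y·pred > 1 and -y·x otherwise, i.e. deriv(pred, y) · x rather than loss(pred, y) · x. A minimal sketch under that formulation (names are illustrative, not the question's code):

```java
public class HingeGradient {
    // Per-instance hinge-loss gradient: grad_i = deriv(pred, y) * x_i,
    // where deriv is 0 if y*pred > 1 (margin satisfied), else -y.
    static double[] gradient(double[] theta, double[] x, double y) {
        double pred = 0.0;
        for (int i = 0; i < x.length; i++) pred += theta[i] * x[i];
        double d = (pred * y > 1) ? 0.0 : -y;
        double[] g = new double[x.length];
        for (int i = 0; i < x.length; i++) g[i] = d * x[i];
        return g;
    }

    public static void main(String[] args) {
        // theta = 0 so pred = 0, margin violated, d = -y = -1
        double[] g = gradient(new double[]{0.0, 0.0}, new double[]{1.0, 2.0}, 1.0);
        System.out.println(g[0] + " " + g[1]); // -1.0 -2.0
    }
}
```

In the question's loop this would mean passing HingeLoss.deriv(hypothesis, outputs__train[p]) into the gradient computation instead of HingeLoss.loss(...).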

Regarding "java - Correct implementation of hinge loss minimization with gradient descent", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/28988732/
