
python - Numpy neural network cost calculation: result changes after first run

Reposted. Author: 行者123. Updated: 2023-11-30 09:43:58

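In Python 3.7, I have a problem with my neural network cost calculation.
When I run compute_cost_nn for the first time, I get the correct cost of 0.28762916516131887, but on all subsequent runs the cost changes to 0.3262751145707298, which is quite annoying.
The problem seems to come from params: if I reload them each time before computing the cost, it works fine. But I can't rerun the function with different parameters and get the correct cost without running the whole script again.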

The neural network has 400 input units, 1 hidden layer with 25 units, and 10 output units.

Here are the inputs:

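import numpy as np
from scipy.io import loadmat

data = loadmat("ex4data1.mat")
y = data['y']
X = data['X']
# Prepend a column of ones (bias term) to X
X = np.c_[np.ones((X.shape[0], 1)), X]

weights = loadmat("ex4weights.mat")
Theta1 = weights['Theta1']
Theta2 = weights['Theta2']
# Flatten both weight matrices into a single 1-D parameter vector
params = np.r_[Theta1.ravel(), Theta2.ravel()]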

Matrix shapes:

>> X: (5000, 401)
>> y: (5000, 1)
>> Theta1: (25, 401)
>> Theta2: (10, 26)
>> params: (10285,)

And the cost function:

def compute_cost_nn(params,
                    input_layer_size,
                    hidden_layer_size,
                    num_labels,
                    X, y, lambda_):

    m = len(y)

    # Retrieve Theta1 and Theta2 from flattened params
    t1_items = (input_layer_size + 1) * hidden_layer_size
    Theta1 = params[0:t1_items].reshape(
        hidden_layer_size,
        input_layer_size+1
    )
    Theta2 = params[t1_items:].reshape(
        num_labels,
        hidden_layer_size+1
    )

    # transform y vector column (5000x1) with labels
    # into 5000x10 matrix with 0s and 1s
    y_mat = np.eye(num_labels)[(y-1).ravel(), :]

    # Forward propagation
    a1 = X
    z2 = a1 @ Theta1.T
    a2 = sigmoid(z2)
    a2 = np.c_[np.ones((m,1)), a2]
    z3 = a2 @ Theta2.T
    a3 = sigmoid(z3)

    # Compute cost
    func = y_mat.T @ np.log(a3) + (1-y_mat).T @ np.log(1-a3)
    cost = func.trace()
    t1reg = (Theta1[:,1:].T @ Theta1[:,1:]).trace()
    t2reg = (Theta2[:,1:].T @ Theta2[:,1:]).trace()
    cost_r = -1/m * cost + lambda_/(2*m) * (t1reg + t2reg)

    # Gradients (excluding Theta0)
    d3 = a3 - y_mat
    d2 = (d3 @ Theta2[:,1:]) * sigmoid_gradient(z2)  # 5000x25

    Delta1 = d2.T @ a1
    Delta2 = d3.T @ a2
    Theta1_grad = 1/m * Delta1
    Theta2_grad = 1/m * Delta2

    # Gradient regularization
    Theta1[:,1] = 0
    Theta2[:,1] = 0
    Theta1_grad = Theta1_grad + lambda_/m * Theta1
    Theta2_grad = Theta2_grad + lambda_/m * Theta2

    return cost_r, Theta1_grad, Theta2_grad

I get the cost by running:

compute_cost_nn(params, 400, 25, 10, X, y, 0)[0]

First run: 0.28762916516131887
After that: 0.3262751145707298

Any hints are much appreciated :)

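Best answer

I haven't tested your code with dummy data, but at a quick glance it looks like you are importing the weights from a .mat (MATLAB) file. MATLAB stores array elements in column-major order (also called Fortran-style order), whereas NumPy defaults to row-major order (C-style order).

So when you first ravel() the weights, NumPy flattens the array in C-style order. The same happens when you reshape the raveled weights inside your function. You can pass the order as an argument to either function, so: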

params = np.r_[Theta1.ravel(order='F'), Theta2.ravel('F')]

should solve your problem.
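To make the effect of the order argument concrete, here is a minimal sketch using a small hypothetical 2x3 array W in place of the real weight matrices (W and the other variable names are assumptions for illustration, not part of the original code). It suggests that a flatten/reshape round trip only recovers the original matrix when ravel and reshape use the same order:

```python
import numpy as np

# Hypothetical 2x3 array standing in for a weight matrix
W = np.array([[1, 2, 3],
              [4, 5, 6]])

flat_c = W.ravel()            # C order (row-major):       1 2 3 4 5 6
flat_f = W.ravel(order='F')   # Fortran order (col-major): 1 4 2 5 3 6

# Reshaping with the same order that was used for flattening recovers W
back_c = flat_c.reshape(2, 3)
back_f = flat_f.reshape(2, 3, order='F')

# Mixing orders (F-order ravel, default C-order reshape) scrambles entries
mixed = flat_f.reshape(2, 3)

print(np.array_equal(back_c, W), np.array_equal(back_f, W))
print(mixed)
```

If the order is changed in ravel(), the reshape calls inside compute_cost_nn would presumably need the matching order as well.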

If you have never come across it, it may be worth reading about row-major and column-major order: https://en.wikipedia.org/wiki/Row-_and_column-major_order
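Separately from memory order, the symptom in the question (correct on the first run, different afterwards, fixed by reloading params) is also consistent with params being modified in place. Reshaping a slice of a contiguous 1-D array normally returns a view, so assignments such as Theta1[:,1] = 0 inside the function would write through to params. Note also that the bias column is index 0, so [:,1] zeroes a non-bias weight. The sketch below uses a small hypothetical parameter vector and hypothetical helper names to illustrate this, along with one possible workaround (copying before modifying):

```python
import numpy as np

# Hypothetical flattened parameter vector (stand-in for params)
params = np.arange(1, 13, dtype=float)
before = params.copy()

def unpack_and_zero(p):
    theta = p[0:6].reshape(2, 3)   # a view on p, not a copy
    theta[:, 0] = 0                # writes through to p
    return theta

theta_view = unpack_and_zero(params)
shares = np.shares_memory(theta_view, params)
changed = not np.array_equal(before, params)

def unpack_and_zero_copy(p):
    theta = p[0:6].reshape(2, 3).copy()  # independent copy
    theta[:, 0] = 0                      # p is left untouched
    return theta

params2 = before.copy()
unpack_and_zero_copy(params2)
unchanged = np.array_equal(before, params2)

print(shares, changed, unchanged)
```

In compute_cost_nn, zeroing a copy of each Theta (for example Theta1.copy()) before the regularization step would be one way to keep params intact between calls.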

Regarding python - Numpy neural network cost calculation: result changes after first run, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/54932926/
