
tensorflow - Shouldn't the same neural network weights produce the same results?

Reposted. Author: 行者123. Updated: 2023-12-04 15:17:19

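As part of my research I am working with different deep learning frameworks, and I have observed something strange (at least, I cannot explain what causes it).

I trained a fairly simple MLP model (on the MNIST dataset) in TensorFlow, extracted the trained weights, created the same model architecture in PyTorch, and applied the trained weights to the PyTorch model. I expected to get the same test accuracy from both the TensorFlow and the PyTorch model, but that is not the case. I get different results.

So my question is: if a model has been trained to some optimum, shouldn't the trained weights produce the same results every time the model is tested on the same dataset, regardless of the framework used?

PyTorch model: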

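import torch
import torch.nn.functional as F
from torch import Tensor, nn


class Net(nn.Module):

    def __init__(self) -> None:
        super().__init__()
        self.fc1 = nn.Linear(784, 24)
        self.fc2 = nn.Linear(24, 10)

    def forward(self, x: Tensor) -> Tensor:
        # Flatten (N, 28, 28) images to (N, 784)
        x = torch.flatten(x, 1)
        x = F.relu(self.fc1(x))
        # Return raw logits (no softmax)
        x = self.fc2(x)
        return x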

TensorFlow model:

import tensorflow as tf
from tensorflow.keras import layers, models


def build_model() -> tf.keras.Model:
    # Build model layers
    model = models.Sequential()
    # Flatten layer
    model.add(layers.Flatten(input_shape=(28, 28)))
    # Fully connected layers; the last one outputs raw logits
    model.add(layers.Dense(24, activation='relu'))
    model.add(layers.Dense(10))
    # Compile the model
    model.compile(
        optimizer='sgd',
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=['accuracy']
    )
    # Return the newly built model
    return model

To extract the weights from the TensorFlow model and apply them to the PyTorch model, I used the following functions:

Extracting the weights:

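import numpy as np


def get_weights(model):
    # Fetch the latest weights (kernel, bias, kernel, bias, ...)
    weights = model.get_weights()
    # Transpose each array: Keras kernels are (in, out), PyTorch expects (out, in).
    # 1-D bias arrays are unchanged by np.transpose.
    t_weights = []
    for w in weights:
        t_weights.append(np.transpose(w))
    return t_weights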
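For context, here is a minimal NumPy sketch of why the transpose is needed. It does not call Keras or PyTorch; it only mimics the two storage conventions: a Keras `Dense` kernel has shape (in_features, out_features) and is applied as `x @ kernel + bias`, while a `torch.nn.Linear` weight has shape (out_features, in_features) and is applied as `x @ weight.T + bias`. The random arrays and the batch size of 8 are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Keras-style layout: kernel is (in_features, out_features)
kernel = rng.standard_normal((784, 24)).astype(np.float32)
bias = rng.standard_normal(24).astype(np.float32)

# PyTorch-style layout: weight is (out_features, in_features)
weight = np.transpose(kernel)

# Stand-in for a batch of 8 flattened 28x28 images (hypothetical data)
x = rng.standard_normal((8, 784)).astype(np.float32)

keras_style = x @ kernel + bias    # how a Dense layer applies its kernel
torch_style = x @ weight.T + bias  # how a Linear layer applies its weight

print(weight.shape)
print(np.allclose(keras_style, torch_style, atol=1e-5))
```

If the transpose is done correctly, both expressions describe the same linear map, so any remaining mismatch between the two frameworks has to come from somewhere else, for example input preprocessing or the order in which the arrays are assigned.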

Applying the weights:

from collections import OrderedDict

import torch


def set_weights(model, weights):
    """Set model weights from a list of NumPy ndarrays."""
    # Pair each state_dict key with the array at the same position
    state_dict = OrderedDict(
        {k: torch.Tensor(v) for k, v in zip(model.state_dict().keys(), weights)}
    )
    model.load_state_dict(state_dict, strict=True)

Best answer

Providing the solution in the answer section for the benefit of the community. From the comments:

If you are using the same weights in the same manner, then the results should be the same, though float rounding error should also be accounted for. Also, it doesn't matter whether the model is trained at all. You can think of your model architecture as a chain of matrix multiplications with element-wise nonlinearities in between. How big is the difference? Are you comparing model outputs, or metrics computed over the dataset? As a suggestion, initialize the model with some random values in Keras, do a forward pass for a single batch (paraphrased from jdehesa and Taras Sereda)
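To get a feel for how large pure rounding error can be, the following sketch runs the same 784 -> 24 -> 10 MLP once in float32 and once in float64 on a single random batch. This is only a NumPy stand-in, not the Keras-versus-PyTorch comparison itself; the weight scale of 0.05, the batch size of 32, and the helper name `mlp_forward` are assumptions chosen for illustration.

```python
import numpy as np


def mlp_forward(x, w1, b1, w2, b2):
    """784 -> 24 -> 10 MLP with ReLU; weights use the (out, in) layout."""
    hidden = np.maximum(x @ w1.T + b1, 0.0)  # fc1 followed by ReLU
    return hidden @ w2.T + b2                # fc2, raw logits


rng = np.random.default_rng(0)

# Random, untrained parameters: training is not needed for this check
params32 = [
    (rng.standard_normal(shape) * 0.05).astype(np.float32)
    for shape in [(24, 784), (24,), (10, 24), (10,)]
]
params64 = [p.astype(np.float64) for p in params32]

# One batch of values in [0, 1), standing in for scaled pixel intensities
batch32 = rng.random((32, 784), dtype=np.float32)
batch64 = batch32.astype(np.float64)

logits32 = mlp_forward(batch32, *params32)
logits64 = mlp_forward(batch64, *params64)

# Compare raw outputs first, then the derived predictions
max_abs_diff = float(np.max(np.abs(logits32 - logits64)))
same_predictions = float(np.mean(logits32.argmax(axis=1) == logits64.argmax(axis=1)))

print(f"max |logit difference|: {max_abs_diff:.2e}")
print(f"fraction of matching predictions: {same_predictions:.2f}")
```

Comparing logits before accuracy separates the two questions raised above: if the raw outputs of the two frameworks differ by far more than this kind of rounding noise, the weights or inputs are probably not being used in the same way; if the outputs agree but the accuracy does not, the evaluation pipeline is the more likely culprit.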

Regarding "tensorflow - Shouldn't the same neural network weights produce the same results?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/64099580/
