neural-network - How to freeze some layers when fine-tuning ResNet50

Reposted by: 行者123  Updated: 2023-12-04 04:31:56

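I am trying to fine-tune ResNet50 with Keras. When I freeze all of the layers in ResNet50, everything works fine. However, I want to freeze only some of the ResNet50 layers, not all of them, and when I do that I get an error. Here is my code: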

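# Imports assumed for standalone Keras 2.x, matching the module paths in the traceback
from keras.applications.resnet50 import ResNet50
from keras.models import Sequential
from keras.layers import Flatten, Dense
from keras.optimizers import Adam
from keras.callbacks import EarlyStopping, ReduceLROnPlateau, ModelCheckpoint

base_model = ResNet50(include_top=False, weights="imagenet", input_shape=(input_size, input_size, input_channels))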
model = Sequential()
model.add(base_model)
model.add(Flatten())
model.add(Dense(80, activation="softmax"))

#this is where the error happens. The commented code works fine
"""
for layer in base_model.layers:
    layer.trainable = False
"""
for layer in base_model.layers[:-26]:
    layer.trainable = False
model.summary()
optimizer = Adam(lr=1e-4)
model.compile(loss="categorical_crossentropy", optimizer=optimizer, metrics=["accuracy"])

callbacks = [
    EarlyStopping(monitor='val_loss', patience=4, verbose=1, min_delta=1e-4),
    ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=2, cooldown=2, verbose=1),
    ModelCheckpoint(filepath='weights/renet50_best_weight.fold_' + str(fold_count) + '.hdf5', save_best_only=True,
                    save_weights_only=True)
    ]

model.load_weights(filepath="weights/renet50_best_weight.fold_1.hdf5")
model.fit_generator(generator=train_generator(), steps_per_epoch=len(df_train) // batch_size,  epochs=epochs, verbose=1,
                  callbacks=callbacks, validation_data=valid_generator(), validation_steps = len(df_valid) // batch_size)

The error is as follows:

Traceback (most recent call last):
  File "/home/jamesben/ai_challenger/src/train.py", line 184, in <module>
    model.load_weights(filepath="weights/renet50_best_weight.fold_" + str(fold_count) + '.hdf5')
  File "/usr/local/lib/python3.5/dist-packages/keras/models.py", line 719, in load_weights
    topology.load_weights_from_hdf5_group(f, layers)
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/topology.py", line 3095, in load_weights_from_hdf5_group
    K.batch_set_value(weight_value_tuples)
  File "/usr/local/lib/python3.5/dist-packages/keras/backend/tensorflow_backend.py", line 2193, in batch_set_value
    get_session().run(assign_ops, feed_dict=feed_dict)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 767, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 944, in _run
    % (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (128,) for Tensor 'Placeholder_72:0', which has shape '(3, 3, 128, 128)'

Can anyone give me some help on how many layers I should freeze with ResNet50?

Best answer

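When you use load_weights() and save_weights() with a nested model, it is easy to hit an error if the trainable settings are not the same at save time and load time.
To fix this error, make sure you freeze the same layers before calling model.load_weights(). That is, if the weights file was saved with all layers frozen, the procedure is: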

1. Recreate the model

2. Freeze all layers in base_model

3. Load the weights

4. Unfreeze the layers you want to train (in this case, base_model.layers[-26:])

For example,

base_model = ResNet50(include_top=False, input_shape=(224, 224, 3))
model = Sequential()
model.add(base_model)
model.add(Flatten())
model.add(Dense(80, activation="softmax"))

for layer in base_model.layers:
    layer.trainable = False
model.load_weights('all_layers_freezed.h5')

for layer in base_model.layers[-26:]:
    layer.trainable = True
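
The same procedure can be wrapped in a small helper. This is only a sketch: load_then_unfreeze and its parameter names are made up for illustration and are not part of Keras, and it assumes the checkpoint was saved with every layer of base_model frozen. It relies only on the .layers, .trainable and load_weights() members already used above.

```python
def load_then_unfreeze(model, base_model, weights_path, n_trainable):
    """Hypothetical helper: load a checkpoint saved with all base layers
    frozen, then unfreeze the last `n_trainable` layers of the base model."""
    # Reproduce the trainable flags that were in effect when the file was saved.
    for layer in base_model.layers:
        layer.trainable = False

    # With identical flags, the weight tensors are listed in the saved order.
    model.load_weights(weights_path)

    # Only now unfreeze the tail. The guard matters because layers[-0:]
    # would select every layer instead of none.
    if n_trainable > 0:
        for layer in base_model.layers[-n_trainable:]:
            layer.trainable = True
    return model
```

For the model in the question this would be called as load_then_unfreeze(model, base_model, 'all_layers_freezed.h5', 26). Keras generally needs model.compile() to be called again after trainable flags change, so compile after this helper and before training.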

Root cause:
When you call model.load_weights(), the weights of each layer are (roughly) loaded through the following steps (in the function load_weights_from_hdf5_group() in topology.py):

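1. Call layer.weights to get the weight tensors

2. Match each weight tensor with its corresponding weight value in the hdf5 file

3. Call K.batch_set_value() to assign the weight values to the weight tensors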

If your model is a nested model, you have to be careful about trainable because of step 1.
I'll explain with an example. For the same model as above, model.summary() gives:

_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
resnet50 (Model)             (None, 1, 1, 2048)        23587712
_________________________________________________________________
flatten_10 (Flatten)         (None, 2048)              0
_________________________________________________________________
dense_5 (Dense)              (None, 80)                163920
=================================================================
Total params: 23,751,632
Trainable params: 11,202,640
Non-trainable params: 12,548,992
_________________________________________________________________

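The inner ResNet50 model is treated as a single layer of model during weight loading. When the layer resnet50 is loaded, calling layer.weights in step 1 is equivalent to calling base_model.weights. The weight tensors of all layers inside the ResNet50 model are collected and returned as one list.
The problem is that, when this list of weight tensors is built, trainable weights are listed before non-trainable weights. In the definition of the Layer class: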

@property
def weights(self):
    return self.trainable_weights + self.non_trainable_weights
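
To see the effect of this property without building a real ResNet50, here is a toy stand-in in plain Python. ToyLayer and ToyNestedModel are hypothetical names, not Keras classes; they only mimic the trainable-first concatenation shown above and track variable names instead of tensors.

```python
class ToyLayer:
    """Stand-in for a single Keras layer; stores variable names only."""

    def __init__(self, name, trainable_vars, non_trainable_vars=()):
        self.trainable = True
        self._trainable = [name + "/" + v for v in trainable_vars]
        self._non_trainable = [name + "/" + v for v in non_trainable_vars]

    @property
    def trainable_weights(self):
        return list(self._trainable) if self.trainable else []

    @property
    def non_trainable_weights(self):
        if self.trainable:
            return list(self._non_trainable)
        # A frozen layer reports all of its variables as non-trainable.
        return self._trainable + self._non_trainable


class ToyNestedModel:
    """Stand-in for a model used as one layer inside another model."""

    def __init__(self, layers):
        self.layers = layers

    @property
    def weights(self):
        trainable = [w for l in self.layers for w in l.trainable_weights]
        frozen = [w for l in self.layers for w in l.non_trainable_weights]
        return trainable + frozen  # same concatenation as Layer.weights


base = ToyNestedModel([
    ToyLayer("conv1", ["kernel", "bias"]),
    ToyLayer("bn_conv1", ["gamma", "beta"], ["moving_mean", "moving_variance"]),
    ToyLayer("res5c_branch2c", ["kernel", "bias"]),
])

for layer in base.layers:
    layer.trainable = False
frozen_order = base.weights

base.layers[-1].trainable = True
mixed_order = base.weights

print(frozen_order[0])  # conv1/kernel
print(mixed_order[0])   # res5c_branch2c/kernel
```

Both lists contain the same variables; only their positions differ, and position is what matters when values are matched against a saved file.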

If all layers in base_model are frozen, the weight tensors will be in the following order:

for layer in base_model.layers:
    layer.trainable = False
print(base_model.weights)

[<tf.Variable 'conv1/kernel:0' shape=(7, 7, 3, 64) dtype=float32_ref>,
 <tf.Variable 'conv1/bias:0' shape=(64,) dtype=float32_ref>,
 <tf.Variable 'bn_conv1/gamma:0' shape=(64,) dtype=float32_ref>,
 <tf.Variable 'bn_conv1/beta:0' shape=(64,) dtype=float32_ref>,
 <tf.Variable 'bn_conv1/moving_mean:0' shape=(64,) dtype=float32_ref>,
 <tf.Variable 'bn_conv1/moving_variance:0' shape=(64,) dtype=float32_ref>,
 <tf.Variable 'res2a_branch2a/kernel:0' shape=(1, 1, 64, 64) dtype=float32_ref>,
 <tf.Variable 'res2a_branch2a/bias:0' shape=(64,) dtype=float32_ref>,
 ...
 <tf.Variable 'res5c_branch2c/kernel:0' shape=(1, 1, 512, 2048) dtype=float32_ref>,
 <tf.Variable 'res5c_branch2c/bias:0' shape=(2048,) dtype=float32_ref>,
 <tf.Variable 'bn5c_branch2c/gamma:0' shape=(2048,) dtype=float32_ref>,
 <tf.Variable 'bn5c_branch2c/beta:0' shape=(2048,) dtype=float32_ref>,
 <tf.Variable 'bn5c_branch2c/moving_mean:0' shape=(2048,) dtype=float32_ref>,
 <tf.Variable 'bn5c_branch2c/moving_variance:0' shape=(2048,) dtype=float32_ref>]

However, if some layers are trainable, the weight tensors of the trainable layers come before the weight tensors of the frozen layers:

for layer in base_model.layers[-5:]:
    layer.trainable = True
print(base_model.weights)

[<tf.Variable 'res5c_branch2c/kernel:0' shape=(1, 1, 512, 2048) dtype=float32_ref>,
 <tf.Variable 'res5c_branch2c/bias:0' shape=(2048,) dtype=float32_ref>,
 <tf.Variable 'bn5c_branch2c/gamma:0' shape=(2048,) dtype=float32_ref>,
 <tf.Variable 'bn5c_branch2c/beta:0' shape=(2048,) dtype=float32_ref>,
 <tf.Variable 'conv1/kernel:0' shape=(7, 7, 3, 64) dtype=float32_ref>,
 <tf.Variable 'conv1/bias:0' shape=(64,) dtype=float32_ref>,
 <tf.Variable 'bn_conv1/gamma:0' shape=(64,) dtype=float32_ref>,
 <tf.Variable 'bn_conv1/beta:0' shape=(64,) dtype=float32_ref>,
 <tf.Variable 'bn_conv1/moving_mean:0' shape=(64,) dtype=float32_ref>,
 <tf.Variable 'bn_conv1/moving_variance:0' shape=(64,) dtype=float32_ref>,
 <tf.Variable 'res2a_branch2a/kernel:0' shape=(1, 1, 64, 64) dtype=float32_ref>,
 <tf.Variable 'res2a_branch2a/bias:0' shape=(64,) dtype=float32_ref>,
 ...
 <tf.Variable 'bn5c_branch2b/moving_mean:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'bn5c_branch2b/moving_variance:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'bn5c_branch2c/moving_mean:0' shape=(2048,) dtype=float32_ref>,
 <tf.Variable 'bn5c_branch2c/moving_variance:0' shape=(2048,) dtype=float32_ref>]

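This change in ordering is why you get an error about tensor shapes. The weight values saved in the hdf5 file are matched with the wrong weight tensors in step 2 above. Everything works when you freeze all layers because your model checkpoint was also saved with all layers frozen, so the order is the same.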
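
The mismatch can be reproduced with shapes alone. The sketch below is a simplification, not the Keras loader: first_mismatch is a hypothetical helper that pairs tensors and saved values purely by position, and the four shapes are taken from the listings above.

```python
# Saved values, in the order written while every layer was frozen.
saved = [
    ("conv1/kernel", (7, 7, 3, 64)),
    ("conv1/bias", (64,)),
    ("res5c_branch2c/kernel", (1, 1, 512, 2048)),
    ("res5c_branch2c/bias", (2048,)),
]

# Tensors as listed after unfreezing the last layer: trainable ones first.
current = saved[2:] + saved[:2]


def first_mismatch(tensors, values):
    """Pair tensors with values by position; return the first shape clash."""
    for (t_name, t_shape), (v_name, v_shape) in zip(tensors, values):
        if t_shape != v_shape:
            return t_name, t_shape, v_name, v_shape
    return None


print(first_mismatch(saved, saved))    # None: same flags, same order
print(first_mismatch(current, saved))  # res5c kernel is paired with conv1 kernel
```

The second call pairs a (1, 1, 512, 2048) tensor with a (7, 7, 3, 64) value, which is the same kind of clash as the ValueError in the question.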

A possibly better solution:
You can use the functional API to avoid the nested model. For example, the following code should work fine:

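from keras.applications.resnet50 import ResNet50
from keras.layers import Flatten, Dense
from keras.models import Model

# Build a flat model: the ResNet50 layers become direct layers of `model`.
base_model = ResNet50(include_top=False, weights="imagenet", input_shape=(input_size, input_size, input_channels))
x = Flatten()(base_model.output)
x = Dense(80, activation="softmax")(x)
model = Model(base_model.input, x)

# Save weights with every ResNet50 layer frozen.
for layer in base_model.layers:
    layer.trainable = False
model.save_weights("all_nontrainable.h5")

# Rebuild the same architecture.
base_model = ResNet50(include_top=False, weights="imagenet", input_shape=(input_size, input_size, input_channels))
x = Flatten()(base_model.output)
x = Dense(80, activation="softmax")(x)
model = Model(base_model.input, x)

# Freeze a different subset this time, then load the same file.
for layer in base_model.layers[:-26]:
    layer.trainable = False
model.load_weights("all_nontrainable.h5")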
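
One way to see why the flat model is less fragile: weights are then matched layer by layer, and within a single leaf layer the order does not depend on the trainable flag. The sketch below illustrates that with a hypothetical layer_weights function. It assumes, as in the Keras 2.0.x source as far as I can tell, that a frozen layer reports its trainable variables first among its non-trainable ones.

```python
def layer_weights(trainable_vars, non_trainable_vars, trainable):
    """Mimic the `weights` order of one leaf layer (assumed behaviour)."""
    if trainable:
        trainable_part = list(trainable_vars)
        frozen_part = list(non_trainable_vars)
    else:
        # Frozen: nothing is trainable, and the variables keep their sequence.
        trainable_part = []
        frozen_part = list(trainable_vars) + list(non_trainable_vars)
    return trainable_part + frozen_part


bn_vars = (["gamma", "beta"], ["moving_mean", "moving_variance"])
as_trainable = layer_weights(*bn_vars, trainable=True)
as_frozen = layer_weights(*bn_vars, trainable=False)

print(as_trainable == as_frozen)  # True
```

Reordering only appears when weights from several layers are pooled into one list, which is what happens when a whole model is nested as a single layer.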

Regarding neural-network - How to freeze some layers when fine-tuning ResNet50, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/46610732/
