Fixing trainable=False Having No Effect When Mixing Keras and TensorFlow Code

Reposted. Author: qq735679552. Last updated: 2022-09-29 22:32:09

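I ran into this problem recently. First, a description:

I have a pretrained model (VGG16, for example) and want to modify it, for instance by adding a fully connected layer. For various reasons I can only use TensorFlow to optimize the model. By default, a tf optimizer updates the weights of every variable in tf.trainable_variables(), and that is where the problem lies: even though the VGG16 model is set to trainable=False, the tf optimizer still updates the VGG16 weights.

That is the problem. After some searching on Google and Baidu I finally found a fix. Let's reproduce the whole issue step by step.

trainable=False has no effect

First, load the pretrained VGG16 model and set it to trainable=False.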

from keras.applications import VGG16
import tensorflow as tf
from keras import layers
# Load the model
base_mode = VGG16(include_top=False)
# List the trainable variables
tf.trainable_variables()
[<tf.Variable 'block1_conv1/kernel:0' shape=(3, 3, 3, 64) dtype=float32_ref>,
 <tf.Variable 'block1_conv1/bias:0' shape=(64,) dtype=float32_ref>,
 <tf.Variable 'block1_conv2/kernel:0' shape=(3, 3, 64, 64) dtype=float32_ref>,
 <tf.Variable 'block1_conv2/bias:0' shape=(64,) dtype=float32_ref>,
 <tf.Variable 'block2_conv1/kernel:0' shape=(3, 3, 64, 128) dtype=float32_ref>,
 <tf.Variable 'block2_conv1/bias:0' shape=(128,) dtype=float32_ref>,
 <tf.Variable 'block2_conv2/kernel:0' shape=(3, 3, 128, 128) dtype=float32_ref>,
 <tf.Variable 'block2_conv2/bias:0' shape=(128,) dtype=float32_ref>,
 <tf.Variable 'block3_conv1/kernel:0' shape=(3, 3, 128, 256) dtype=float32_ref>,
 <tf.Variable 'block3_conv1/bias:0' shape=(256,) dtype=float32_ref>,
 <tf.Variable 'block3_conv2/kernel:0' shape=(3, 3, 256, 256) dtype=float32_ref>,
 <tf.Variable 'block3_conv2/bias:0' shape=(256,) dtype=float32_ref>,
 <tf.Variable 'block3_conv3/kernel:0' shape=(3, 3, 256, 256) dtype=float32_ref>,
 <tf.Variable 'block3_conv3/bias:0' shape=(256,) dtype=float32_ref>,
 <tf.Variable 'block4_conv1/kernel:0' shape=(3, 3, 256, 512) dtype=float32_ref>,
 <tf.Variable 'block4_conv1/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block4_conv2/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block4_conv2/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block4_conv3/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block4_conv3/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block5_conv1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block5_conv1/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block5_conv2/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block5_conv2/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block5_conv3/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block5_conv3/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block1_conv1_1/kernel:0' shape=(3, 3, 3, 64) dtype=float32_ref>,
 <tf.Variable 'block1_conv1_1/bias:0' shape=(64,) dtype=float32_ref>,
 <tf.Variable 'block1_conv2_1/kernel:0' shape=(3, 3, 64, 64) dtype=float32_ref>,
 <tf.Variable 'block1_conv2_1/bias:0' shape=(64,) dtype=float32_ref>,
 <tf.Variable 'block2_conv1_1/kernel:0' shape=(3, 3, 64, 128) dtype=float32_ref>,
 <tf.Variable 'block2_conv1_1/bias:0' shape=(128,) dtype=float32_ref>,
 <tf.Variable 'block2_conv2_1/kernel:0' shape=(3, 3, 128, 128) dtype=float32_ref>,
 <tf.Variable 'block2_conv2_1/bias:0' shape=(128,) dtype=float32_ref>,
 <tf.Variable 'block3_conv1_1/kernel:0' shape=(3, 3, 128, 256) dtype=float32_ref>,
 <tf.Variable 'block3_conv1_1/bias:0' shape=(256,) dtype=float32_ref>,
 <tf.Variable 'block3_conv2_1/kernel:0' shape=(3, 3, 256, 256) dtype=float32_ref>,
 <tf.Variable 'block3_conv2_1/bias:0' shape=(256,) dtype=float32_ref>,
 <tf.Variable 'block3_conv3_1/kernel:0' shape=(3, 3, 256, 256) dtype=float32_ref>,
 <tf.Variable 'block3_conv3_1/bias:0' shape=(256,) dtype=float32_ref>,
 <tf.Variable 'block4_conv1_1/kernel:0' shape=(3, 3, 256, 512) dtype=float32_ref>,
 <tf.Variable 'block4_conv1_1/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block4_conv2_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block4_conv2_1/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block4_conv3_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block4_conv3_1/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block5_conv1_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block5_conv1_1/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block5_conv2_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block5_conv2_1/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block5_conv3_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block5_conv3_1/bias:0' shape=(512,) dtype=float32_ref>]
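# Set trainable=False
# base_mode.trainable = False also seems to work
for layer in base_mode.layers:
    layer.trainable = False

After setting trainable=False, list the trainable variables again: nothing has changed, which means the setting had no effect.

# List the trainable variables again
tf.trainable_variables()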

[<tf.Variable 'block1_conv1/kernel:0' shape=(3, 3, 3, 64) dtype=float32_ref>,
 <tf.Variable 'block1_conv1/bias:0' shape=(64,) dtype=float32_ref>,
 <tf.Variable 'block1_conv2/kernel:0' shape=(3, 3, 64, 64) dtype=float32_ref>,
 <tf.Variable 'block1_conv2/bias:0' shape=(64,) dtype=float32_ref>,
 <tf.Variable 'block2_conv1/kernel:0' shape=(3, 3, 64, 128) dtype=float32_ref>,
 <tf.Variable 'block2_conv1/bias:0' shape=(128,) dtype=float32_ref>,
 <tf.Variable 'block2_conv2/kernel:0' shape=(3, 3, 128, 128) dtype=float32_ref>,
 <tf.Variable 'block2_conv2/bias:0' shape=(128,) dtype=float32_ref>,
 <tf.Variable 'block3_conv1/kernel:0' shape=(3, 3, 128, 256) dtype=float32_ref>,
 <tf.Variable 'block3_conv1/bias:0' shape=(256,) dtype=float32_ref>,
 <tf.Variable 'block3_conv2/kernel:0' shape=(3, 3, 256, 256) dtype=float32_ref>,
 <tf.Variable 'block3_conv2/bias:0' shape=(256,) dtype=float32_ref>,
 <tf.Variable 'block3_conv3/kernel:0' shape=(3, 3, 256, 256) dtype=float32_ref>,
 <tf.Variable 'block3_conv3/bias:0' shape=(256,) dtype=float32_ref>,
 <tf.Variable 'block4_conv1/kernel:0' shape=(3, 3, 256, 512) dtype=float32_ref>,
 <tf.Variable 'block4_conv1/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block4_conv2/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block4_conv2/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block4_conv3/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block4_conv3/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block5_conv1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block5_conv1/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block5_conv2/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block5_conv2/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block5_conv3/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block5_conv3/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block1_conv1_1/kernel:0' shape=(3, 3, 3, 64) dtype=float32_ref>,
 <tf.Variable 'block1_conv1_1/bias:0' shape=(64,) dtype=float32_ref>,
 <tf.Variable 'block1_conv2_1/kernel:0' shape=(3, 3, 64, 64) dtype=float32_ref>,
 <tf.Variable 'block1_conv2_1/bias:0' shape=(64,) dtype=float32_ref>,
 <tf.Variable 'block2_conv1_1/kernel:0' shape=(3, 3, 64, 128) dtype=float32_ref>,
 <tf.Variable 'block2_conv1_1/bias:0' shape=(128,) dtype=float32_ref>,
 <tf.Variable 'block2_conv2_1/kernel:0' shape=(3, 3, 128, 128) dtype=float32_ref>,
 <tf.Variable 'block2_conv2_1/bias:0' shape=(128,) dtype=float32_ref>,
 <tf.Variable 'block3_conv1_1/kernel:0' shape=(3, 3, 128, 256) dtype=float32_ref>,
 <tf.Variable 'block3_conv1_1/bias:0' shape=(256,) dtype=float32_ref>,
 <tf.Variable 'block3_conv2_1/kernel:0' shape=(3, 3, 256, 256) dtype=float32_ref>,
 <tf.Variable 'block3_conv2_1/bias:0' shape=(256,) dtype=float32_ref>,
 <tf.Variable 'block3_conv3_1/kernel:0' shape=(3, 3, 256, 256) dtype=float32_ref>,
 <tf.Variable 'block3_conv3_1/bias:0' shape=(256,) dtype=float32_ref>,
 <tf.Variable 'block4_conv1_1/kernel:0' shape=(3, 3, 256, 512) dtype=float32_ref>,
 <tf.Variable 'block4_conv1_1/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block4_conv2_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block4_conv2_1/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block4_conv3_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block4_conv3_1/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block5_conv1_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block5_conv1_1/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block5_conv2_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block5_conv2_1/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block5_conv3_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block5_conv3_1/bias:0' shape=(512,) dtype=float32_ref>]
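
One way to picture why the list does not change (an interpretation, not something the original post spells out): in the TensorFlow 1.x graph API a variable is registered in the trainable-variables collection when it is created, while the Keras trainable flag is an attribute of the layer that only Keras' own training code consults. The pure-Python sketch below imitates that split. ToyVariable, ToyLayer and GLOBAL_TRAINABLE are hypothetical stand-ins, not TensorFlow or Keras names.

```python
# Toy model of the mismatch; none of these names are real TF/Keras APIs.
GLOBAL_TRAINABLE = []  # stands in for the global trainable-variables collection


class ToyVariable:
    def __init__(self, name):
        self.name = name
        # Registration happens once, at creation time.
        GLOBAL_TRAINABLE.append(self)


class ToyLayer:
    def __init__(self, name):
        self.trainable = True  # layer-level flag, like layer.trainable in Keras
        self.weights = [ToyVariable(name + "/kernel:0"),
                        ToyVariable(name + "/bias:0")]

    @property
    def trainable_weights(self):
        # Only the layer's own view looks at the flag.
        return self.weights if self.trainable else []


layer = ToyLayer("block1_conv1")
before = [v.name for v in GLOBAL_TRAINABLE]

layer.trainable = False  # freeze at the layer level
after = [v.name for v in GLOBAL_TRAINABLE]

print(layer.trainable_weights)  # [] : the layer itself reports nothing to train
print(before == after)          # True : the global list is untouched
```

An optimizer that reads the global list would therefore keep updating the "frozen" weights, which matches the behaviour observed here.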

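The solution

The fix is to create a variable_scope when loading the model, put the variables that do need training in a different variable_scope, collect those variables with tf.get_collection, and finally pass them to the tf optimizer through var_list.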

from keras import models
with tf.variable_scope('base_model'):
    base_model = VGG16(include_top=False, input_shape=(224, 224, 3))
with tf.variable_scope('xxx'):
    model = models.Sequential()
    model.add(base_model)
    model.add(layers.Flatten())
    model.add(layers.Dense(10))
# Collect the variables that need training
trainable_var = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, 'xxx')
trainable_var

[<tf.Variable 'xxx_2/dense_1/kernel:0' shape=(25088, 10) dtype=float32_ref>, <tf.Variable 'xxx_2/dense_1/bias:0' shape=(10,) dtype=float32_ref>]
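
Note that the returned names start with xxx_2 rather than xxx, presumably because the scope had already been opened earlier in the same session. The lookup still works because the scope argument of tf.get_collection is matched against variable names with re.match, so a plain string acts as a prefix filter. The sketch below reproduces that filtering in pure Python. filter_by_scope is a hypothetical helper, and the base_model/... name is made up for illustration.

```python
import re


def filter_by_scope(names, scope):
    # Keep names that match `scope` from the start of the string,
    # i.e. prefix filtering when `scope` has no regex special tokens.
    return [n for n in names if re.match(scope, n)]


names = [
    "base_model/block1_conv1/kernel:0",  # hypothetical frozen variable
    "xxx_2/dense_1/kernel:0",
    "xxx_2/dense_1/bias:0",
]

print(filter_by_scope(names, "xxx"))
# ['xxx_2/dense_1/kernel:0', 'xxx_2/dense_1/bias:0']
print(filter_by_scope(names, "xxx/"))
# []
```

The second call shows the flip side: a stricter scope such as 'xxx/' would miss variables that ended up under a uniquified scope name like xxx_2, so it is worth printing the collected list before training.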

# Define a tf optimizer for training; assume some loss exists
loss = model.output / 2  # arbitrary definition, just for demonstration
train_step = tf.train.AdamOptimizer().minimize(loss, var_list=trainable_var)
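
To make the effect of var_list concrete, here is a minimal pure-Python sketch of an update step that touches only the selected variables. It uses plain gradient descent on scalar "weights" instead of Adam, and sgd_step, the variable names and the numbers are all illustrative assumptions rather than anything from TensorFlow.

```python
def sgd_step(variables, grads, var_list, lr=0.5):
    """Return new values; only names listed in `var_list` are updated."""
    selected = set(var_list)
    return {
        name: (value - lr * grads[name]) if name in selected else value
        for name, value in variables.items()
    }


# Hypothetical scalar weights: one frozen backbone weight, one new head weight.
variables = {"base_model/conv/kernel": 1.0, "xxx/dense/kernel": 2.0}
grads = {"base_model/conv/kernel": 1.0, "xxx/dense/kernel": 1.0}

# Same idea as collecting variables by scope prefix.
var_list = [name for name in variables if name.startswith("xxx")]

updated = sgd_step(variables, grads, var_list)
print(updated["base_model/conv/kernel"])  # 1.0 : left untouched
print(updated["xxx/dense/kernel"])        # 1.5 : 2.0 - 0.5 * 1.0
```

Both weights receive a gradient, but only the one in var_list moves. A similar before/after comparison of the real VGG16 weights is a quick way to confirm that the freeze actually took effect.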

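Summary

When mixing Keras and TensorFlow code, setting trainable=False in Keras has no effect as far as TensorFlow is concerned.

The fix is to separate the variables with variable_scope, collect the ones that need training with tf.get_collection, and then pass them to the tf optimizer via var_list.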

Original link: https://blog.csdn.net/weiwei9363/article/details/79673201