python - Training a new model in TensorFlow using a pretrained model

Reposted by: 行者123 · Updated: 2023-11-30 08:48:42

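I am building a CNN autoencoder in TensorFlow to act as a feature extractor, followed by a simple MLP classifier. I train the two separately: first I train the autoencoder to encode the data into a lower-dimensional feature space, and then I train the MLP classifier on its own by passing the input through the trained autoencoder and then through the MLP.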

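I am currently having trouble connecting the two models. My approach is to load the old graph and fetch its output tensor and its original input placeholder. I then add a stop gradient on the last layer of the original graph, so that only the MLP is trained and not the autoencoder. Finally, I use a variable scope to initialize only the variables of the new graph.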

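When I run the code I get several errors, ranging from uninitialized variables to too many variables. Is there a better way to do this? The code is below.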

Autoencoder training code

import tensorflow as tf
import numpy as np
import math

def lrelu(x, leak=0.2, name="lrelu"):
    """Leaky rectifier.
    Parameters
    ----------
    x : Tensor
        The tensor to apply the nonlinearity to.
    leak : float, optional
        Leakage parameter.
    name : str, optional
        Variable scope to use.
    Returns
    -------
    x : Tensor
        Output of the nonlinearity.
    """
    with tf.variable_scope(name):
        f1 = 0.5 * (1 + leak)
        f2 = 0.5 * (1 - leak)
        return f1 * x + f2 * abs(x)

def corrupt(x):
    """Take an input tensor and add uniform masking.
    Parameters
    ----------
    x : Tensor/Placeholder
        Input to corrupt.
    Returns
    -------
    x_corrupted : Tensor
        50 pct of values corrupted.
    """
    return tf.multiply(x, tf.cast(tf.random_uniform(shape=tf.shape(x),
                                               minval=0,
                                               maxval=2,
                                               dtype=tf.int32), tf.float32))

def autoencoder(input_shape = [None, 784],
               n_filters = [1, 10, 10, 10],
               filter_sizes = [3, 3, 3, 3],
               corruption = False):
    """Build a deep denoising autoencoder w/ tied weights.
    Parameters
    ----------
    input_shape : list, optional
        Description
    n_filters : list, optional
        Description
    filter_sizes : list, optional
        Description
    Returns
    -------
    x : Tensor
        Input placeholder to the network
    z : Tensor
        Inner-most latent representation
    y : Tensor
        Output reconstruction of the input
    cost : Tensor
        Overall cost to use for training
    Raises
    ------
    ValueError
        Description
    """

    # Input to network
    x = tf.placeholder(tf.float32, input_shape, name = 'x')
    print(x)

    # Reshape a flat 2D input (batch, pixels) into a square image
    if len(x.get_shape()) == 2:
        x_dim = np.sqrt(x.get_shape().as_list()[1])
        if x_dim != int(x_dim):
            raise ValueError('Unsupported Input Dimensions')
        x_dim = int(x_dim)
        x_tensor = tf.reshape(x, [-1, x_dim, x_dim, n_filters[0]])
    elif len(x.get_shape()) == 4:
        x_tensor = x
    else:
        raise ValueError('Unsupported Input Dimensions')
    current_input = x_tensor

    # Optionally apply denoising autoencoder
    if corruption:
        current_input = corrupt(current_input)

    # Encoder
    encoder = []
    shapes = []
    for layer_i, n_output in enumerate(n_filters[1:]):
        n_input = current_input.get_shape().as_list()[3] # This will be # Channels
        shapes.append(current_input.get_shape().as_list())
        W = tf.Variable(
            tf.random_uniform([
                filter_sizes[layer_i],
                filter_sizes[layer_i],
                n_input, n_output],
                -1.0 / math.sqrt(n_input),
                1.0/math.sqrt(n_input))) # This is so we don't have to initialize ourselves
        b = tf.Variable(tf.zeros([n_output]))
        encoder.append(W)
        output = lrelu(
            tf.add(tf.nn.conv2d(
                current_input, W, strides = [1,2,2,1], padding = 'SAME'), b))
        current_input = output
        print(W)
        print(b)
        print(output)

    # Store the latent representation
    z = current_input
    print(z)
    encoder.reverse()
    shapes.reverse()

    # Decoder: reuse the encoder weights (tied weights) in reverse order
    for layer_i, shape in enumerate(shapes):
        W = encoder[layer_i]
        b = tf.Variable(tf.zeros([W.get_shape().as_list()[2]]))
        output = lrelu(tf.add(
            tf.nn.conv2d_transpose(
                current_input, W,
                tf.stack([tf.shape(x)[0], shape[1], shape[2], shape[3]]),
                strides = [1,2,2,1], padding = 'SAME'), b))
        current_input = output

    # Now we have a reconstruction
    y = current_input
    cost = tf.reduce_sum(tf.square(y - x_tensor))

    return {'x': x, 'z': z, 'y': y, 'cost': cost}

# %%
def test_mnist():
    """Test the convolutional autoencoder using MNIST."""
    # %%
    import tensorflow as tf
    import tensorflow.examples.tutorials.mnist.input_data as input_data
    import matplotlib.pyplot as plt

    # %%
    # load MNIST as before
    mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
    mean_img = np.mean(mnist.train.images, axis=0)
    ae = autoencoder()

    # %%
    learning_rate = 0.01
    optimizer = tf.train.AdamOptimizer(learning_rate).minimize(ae['cost'])

    # Create saver
    saver = tf.train.Saver(tf.trainable_variables())

    # %%
    # We create a session to use the graph
    sess = tf.Session()
    sess.run(tf.global_variables_initializer())

    # %%
    # Fit all training data
    batch_size = 100
    n_epochs = 1
    for epoch_i in range(n_epochs):
        for batch_i in range(mnist.train.num_examples // batch_size):
            batch_xs, _ = mnist.train.next_batch(batch_size)
            train = np.array([img - mean_img for img in batch_xs])
            sess.run(optimizer, feed_dict={ae['x']: train})
        print(epoch_i, sess.run(ae['cost'], feed_dict={ae['x']: train}))

    save_path = saver.save(sess, "AutoEncoderCheckpoints/AutoEncoderMNIST.ckpt")
    print("Model saved in path: %s" % save_path)

    # %%
    # Plot example reconstructions
    n_examples = 10
    test_xs, _ = mnist.test.next_batch(n_examples)
    test_xs_norm = np.array([img - mean_img for img in test_xs])
    recon, latent = sess.run([ae['y'], ae['z']], feed_dict={ae['x']: test_xs_norm})
    print(recon.shape)
    print(latent.shape)
    fig, axs = plt.subplots(2, n_examples, figsize=(20, 6))
    for example_i in range(n_examples):
        axs[0][example_i].imshow(
            np.reshape(test_xs[example_i, :], (28, 28)))
        axs[1][example_i].imshow(
            np.reshape(
                np.reshape(recon[example_i, ...], (784,)) + mean_img,
                (28, 28)))
    fig.show()
    plt.draw()
#     plt.waitforbuttonpress()

    new_fig, new_axs = plt.subplots(10, n_examples, figsize = (20,20))
    for chan in range(10):
        for example_i in range(n_examples):
            new_axs[chan][example_i].imshow(
            np.reshape(latent[example_i,...,chan],
            (4,4)))
    new_fig.show()
    plt.draw()

# %%
if __name__ == '__main__':
    test_mnist()

Code that fails to train the MLP without retraining the autoencoder

aeMLP_saver = tf.train.import_meta_graph('AutoEncoderCheckpoints/AutoEncoderMNIST.ckpt.meta')
aeMLP_graph = tf.get_default_graph()

weights = {
    'h1': tf.Variable(tf.random_normal([160, 320])),
    'h2': tf.Variable(tf.random_normal([320, 640])),
    'out': tf.Variable(tf.random_normal([640, 10]))
}
biases = {
    'b1': tf.Variable(tf.random_normal([320])),
    'b2': tf.Variable(tf.random_normal([640])),
    'out': tf.Variable(tf.random_normal([10]))
}

# with tf.Graph().as_default():
with tf.variable_scope("model2"):
    x_plh = aeMLP_graph.get_tensor_by_name('x:0')
    output_conv = aeMLP_graph.get_tensor_by_name('lrelu_2/add:0')

    output_conv_sg = tf.stop_gradient(output_conv)
    print(output_conv_sg)

    output_conv_shape = output_conv_sg.get_shape().as_list()
    print(output_conv_shape)

    new_input = tf.reshape(output_conv_sg, [-1, 160])

    Y = tf.placeholder("float", [None, 10])
    # First hidden fully connected layer (160 -> 320 units)
    layer_1 = tf.add(tf.matmul(new_input, weights['h1']), biases['b1'])
    # Second hidden fully connected layer (320 -> 640 units)
    layer_2 = tf.add(tf.matmul(layer_1, weights['h2']), biases['b2'])
    # Output fully connected layer with one unit per class
    out_layer = tf.matmul(layer_2, weights['out']) + biases['out']
    print(layer_1)
    print(layer_2)
    print(out_layer)
    y_pred = tf.nn.softmax(out_layer)

    correct_prediction = tf.equal(tf.argmax(y_pred,1), tf.argmax(Y,1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

    loss_op = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=out_layer, labels=Y))
    learning_rate = 0.001
    optimizer = tf.train.AdamOptimizer(learning_rate).minimize(loss_op)


# out_layer_mlp, y_pred = multilayer_perceptron(new_input)

model_2_variables_list = tf.get_collection(
    tf.GraphKeys.GLOBAL_VARIABLES,
    scope="model2"
)

print(model_2_variables_list)

init2 = tf.variables_initializer(model_2_variables_list)

import tensorflow as tf
import tensorflow.examples.tutorials.mnist.input_data as input_data
import matplotlib.pyplot as plt

# %%
# load MNIST as before
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
mean_img = np.mean(mnist.train.images, axis=0)

# Create saver
saver_new = tf.train.Saver()

with tf.Session() as sess:
    sess.run(init2)

    # %%
    # Fit all training data
    batch_size = 100
    n_epochs = 1
    for epoch_i in range(n_epochs):
        for batch_i in range(mnist.train.num_examples // batch_size):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            train = np.array([img - mean_img for img in batch_xs])
            _,c = sess.run([optimizer, loss_op], feed_dict={x_plh: train, Y: batch_ys})
        print(epoch_i, " || ", c)
        batch_xt, batch_yt = mnist.test.next_batch(batch_size)
        test = np.array([img - mean_img for img in batch_xt])
        acc = sess.run(accuracy, feed_dict = {x_plh: test, Y: batch_yt})
        print("Accuracy is: ", acc)

    save_path = saver_new.save(sess, "AutoEncoderCheckpoints/AutoEncoderClassifierMNIST.ckpt")
    print("Model saved in path: %s" % save_path)

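Both code snippets above are runnable, so you should be able to reproduce the errors I am seeing. I have read some posts suggesting that I might freeze the graph, but I am not sure that is the best solution.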

Best answer

This post would be more useful to others if you included the errors you are getting.

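The first obvious problem is that importing the graph with tf.train.import_meta_graph does not initialize the variables. See https://www.tensorflow.org/api_docs/python/tf/train/import_meta_graph for an example that calls restore to actually restore the variable values.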
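A minimal sketch of the missing restore step, under some assumptions: it uses the `tensorflow.compat.v1` API so that it also runs on TensorFlow 2.x installations, and a toy one-variable graph stands in for the trained autoencoder. The checkpoint name `toy.ckpt` and the names `w` and `z` are hypothetical and not part of the original code.

```python
import os
import tempfile

import numpy as np
# Assumption: a TensorFlow build that exposes the v1 compatibility API.
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

ckpt_dir = tempfile.mkdtemp()
ckpt_path = os.path.join(ckpt_dir, "toy.ckpt")  # hypothetical checkpoint name

# 1) Build and save a tiny stand-in for the trained autoencoder.
with tf.Graph().as_default():
    x = tf.placeholder(tf.float32, [None, 2], name="x")
    w = tf.Variable([[1.0], [2.0]], name="w")
    z = tf.matmul(x, w, name="z")
    saver = tf.train.Saver()
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        saver.save(sess, ckpt_path)

# 2) Re-import the graph. This recreates the ops but not the variable values.
with tf.Graph().as_default() as graph:
    restorer = tf.train.import_meta_graph(ckpt_path + ".meta")
    x = graph.get_tensor_by_name("x:0")
    z = graph.get_tensor_by_name("z:0")
    with tf.Session() as sess:
        # Before restore, the imported variable has no value yet.
        uninitialized_before = sess.run(tf.report_uninitialized_variables())
        # restore() loads the saved values into the imported variables.
        restorer.restore(sess, ckpt_path)
        uninitialized_after = sess.run(tf.report_uninitialized_variables())
        restored_z = sess.run(
            z, feed_dict={x: np.array([[1.0, 1.0]], dtype=np.float32)})

print(uninitialized_before, uninitialized_after, restored_z)
```

In the question's code, the equivalent step would be calling `aeMLP_saver.restore(sess, ...)` with the autoencoder checkpoint path inside the session, in addition to running `init2` for the new variables.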

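At a higher level, since you have the code that builds the original training graph, you may not need save/restore at all. One possible approach is to build the whole graph (AE and MLP). First train the AE (by calling sess.run with the AE's training op), then apply stop_gradient and train the MLP. You could also build separate towers that share the variables you want. The reason I recommend against save/restore (unless you have another use case for it) is that relying on tensor names can be brittle.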
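The single-graph approach could look roughly like the sketch below. It is an illustration under stated assumptions, not the asker's model: a tiny dense autoencoder on random data replaces the convolutional one, the scope names `autoencoder` and `mlp` and all variable names are hypothetical, and the `tensorflow.compat.v1` API is assumed. Passing `var_list` to `minimize` is an extra safeguard on top of `tf.stop_gradient`.

```python
import numpy as np
# Assumption: a TensorFlow build that exposes the v1 compatibility API.
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

# Random stand-in data: 64 samples, 8 features, 2 classes.
rng = np.random.RandomState(0)
data = rng.rand(64, 8).astype(np.float32)
labels = np.eye(2, dtype=np.float32)[rng.randint(0, 2, size=64)]

with tf.Graph().as_default():
    x = tf.placeholder(tf.float32, [None, 8], name="x")
    y_true = tf.placeholder(tf.float32, [None, 2], name="y_true")

    # Autoencoder part of the graph (hypothetical toy architecture).
    with tf.variable_scope("autoencoder"):
        w_enc = tf.get_variable("w_enc", [8, 3])
        w_dec = tf.get_variable("w_dec", [3, 8])
        z = tf.nn.tanh(tf.matmul(x, w_enc))
        recon = tf.matmul(z, w_dec)
        ae_loss = tf.reduce_mean(tf.square(recon - x))

    # Classifier part, fed by the latent code with gradients blocked.
    with tf.variable_scope("mlp"):
        w_out = tf.get_variable("w_out", [3, 2])
        b_out = tf.get_variable("b_out", [2], initializer=tf.zeros_initializer())
        logits = tf.matmul(tf.stop_gradient(z), w_out) + b_out
        mlp_loss = tf.reduce_mean(
            tf.nn.softmax_cross_entropy_with_logits_v2(
                labels=y_true, logits=logits))

    ae_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope="autoencoder")
    mlp_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope="mlp")

    # One training op per phase, each restricted to its own variables.
    train_ae = tf.train.AdamOptimizer(0.01).minimize(ae_loss, var_list=ae_vars)
    train_mlp = tf.train.AdamOptimizer(0.01).minimize(mlp_loss, var_list=mlp_vars)

    with tf.Session() as sess:
        # Everything lives in one graph, so one initializer covers all
        # variables, including the optimizers' internal ones.
        sess.run(tf.global_variables_initializer())

        # Phase 1: train only the autoencoder.
        for _ in range(20):
            sess.run(train_ae, feed_dict={x: data})
        enc_before, mlp_before = sess.run([w_enc, w_out])

        # Phase 2: train only the MLP on the frozen features.
        for _ in range(20):
            sess.run(train_mlp, feed_dict={x: data, y_true: labels})
        enc_after, mlp_after = sess.run([w_enc, w_out])
```

Because both phases share one graph and one session, no checkpoint, tensor-name lookup, or partial initializer is needed, and `enc_before` equals `enc_after` since `train_mlp` only updates the variables in `mlp_vars`.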

This question, python - Training a new model in TensorFlow using a pretrained model, is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/51269109/
