
python - Why doesn't this simple Tensorflow code succeed? (ConvnetJS with Tensorflow)


There is a simple and instructive toy classifier (2 fully connected layers), available as an in-browser JavaScript demo: http://cs.stanford.edu/people/karpathy/convnetjs/demo/classify2d.html

Here, the input is a list of 2D points with labels in {0, 1}. As you can see, the architecture they define is the following:

layer_defs = [];
layer_defs.push({type:'input', out_sx:1, out_sy:1, out_depth:2});
layer_defs.push({type:'fc', num_neurons:6, activation: 'tanh'});
layer_defs.push({type:'fc', num_neurons:2, activation: 'tanh'});
layer_defs.push({type:'softmax', num_classes:2});

I am trying to reproduce this with tensorflow as shown below (there is no separate softmax layer here because tf.nn.sparse_softmax_cross_entropy_with_logits applies the softmax as part of the loss):

pts = tf.placeholder(tf.float32, [None,2], name="p")
label = tf.placeholder(tf.int32, [None], name="labels")

with tf.variable_scope("layers") as scope:
    fc1 = fc_layer(pts, [2, 6], "fc1")
    fc1 = tf.nn.tanh(fc1)
    fc2 = fc_layer(fc1, [6, 2], "fc2")
    fc2 = tf.nn.tanh(fc2)

cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(fc2, label, name='cross_entropy_per_example')
cross_entropy_mean = tf.reduce_mean(cross_entropy, name='cross_entropy')
optimizer = tf.train.MomentumOptimizer(learning_rate, 0.9)
train_op = optimizer.minimize(cross_entropy_mean, global_step=global_step)

The function fc_layer is nothing more than:

def fc_layer(bottom, weight_shape, name):
    W = tf.get_variable(name+'W', shape=weight_shape, dtype=tf.float32, initializer=tf.random_normal_initializer(mean=0.01, stddev=0.01))
    b = tf.get_variable(name+'b', shape=[weight_shape[1]], dtype=tf.float32, initializer=tf.random_normal_initializer(mean=0.01, stddev=0.01))
    fc = tf.nn.bias_add(tf.matmul(bottom, W), b)
    return fc
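
A quick way to sanity-check the layer shapes (a minimal sketch, not part of the original question; it assumes the fc_layer above is defined in the same TF 1.x script, and the demo_* names are purely illustrative):

import tensorflow as tf

# Shape check: a [None, 2] batch times a [2, 6] weight matrix plus a [6] bias
# gives a [None, 6] activation.
demo_pts = tf.placeholder(tf.float32, [None, 2], name="demo_p")
with tf.variable_scope("demo"):
    demo_fc1 = fc_layer(demo_pts, [2, 6], "fc1")
print(demo_fc1.get_shape())  # expected: (?, 6)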

However, the loss does not seem to decrease. Is there something wrong with the loss definition (the cross entropy)?

Can anyone help?

Best Answer

Looking more closely, I don't think there is anything wrong with the loss definition.
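
One way to convince yourself of that (a small aside, not part of the original answer): with two balanced classes and a still-uninformative network, the softmax puts roughly 0.5 on each class, so the mean cross-entropy should start out near ln(2) ≈ 0.69; the problem is that it fails to drop from there, not that it is computed incorrectly.

import math

# Expected starting value of the mean cross-entropy for 2 balanced classes
# when the network assigns probability ~0.5 to each class.
print(-math.log(0.5))  # ~0.6931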


I found a few parameter definitions that differ from the original ConvNetJS demo. However, choosing the same parameters did not change the behavior.

Then I realized that the ConvNetJS page does not explain how the weights are initialized (a quick search through the sources did not turn it up, and the code sample on that page is hidden in a textarea :-P). This is the one thing that actually changed the behavior.
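
In other words, the only substantive change is the weight initializer. A minimal sketch of the two variants, assuming the TF 1.x API used throughout this post, with n standing for the number of inputs of the layer:

import tensorflow as tf

n = 2  # fan-in of the layer, e.g. 2 for the first fc layer fed with the 2D points

# Initializer from the question: small positive mean, tiny spread.
init_orig = tf.random_normal_initializer(mean=0.01, stddev=0.01)

# Initializer used in the code below: zero mean, spread scaled by the fan-in.
init_scaled = tf.random_normal_initializer(mean=0.0, stddev=1.0 / n)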

The other parameter that affects the result is the batch size.

Before (mean=0.01, dev=0.01):

[figure: Original weights (mean=0.01, dev=0.01)]

After (mean=0, dev=1/n), where n is the number of inputs of the layer:

[figure: New weights (mean=0, dev=1/n)]

The code that produced the second image (swap the weights back to the original values to get the first one); it learns to recognize when the two input numbers have the same sign:

import tensorflow as tf
import random

# Training data
points = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(1000000)]
labels = [1 if x * y > 0.0 else 0 for (x, y) in points]

batch_size = 100 # a divider of len(points) to keep things simple
momentum = 0.9
global_step=tf.Variable(0, trainable=False)
learning_rate = tf.train.exponential_decay(0.01, global_step, 10, 0.99, staircase=True)

###
### The original code, where `momentum` is now a variable,
### and the weights are initialized differently.
###
def fc_layer(bottom, weight_shape, name):
    # Zero-mean weights and biases with a standard deviation of 1/fan_in
    # (weight_shape[0] is the number of inputs of the layer).
    W = tf.get_variable(name+'W', shape=weight_shape, dtype=tf.float32, initializer=tf.random_normal_initializer(mean=0., stddev=(1.0/weight_shape[0])))
    b = tf.get_variable(name+'b', shape=[weight_shape[1]], dtype=tf.float32, initializer=tf.random_normal_initializer(mean=0., stddev=(1.0/weight_shape[0])))
    fc = tf.nn.bias_add(tf.matmul(bottom, W), b)
    return fc

pts = tf.placeholder(tf.float32, [None,2], name="p")
label = tf.placeholder(tf.int32, [None], name="labels")

with tf.variable_scope("layers") as scope:
    fc1 = fc_layer(pts, [2, 6], "fc1")
    fc1 = tf.nn.tanh(fc1)
    fc2 = fc_layer(fc1, [6, 2], "fc2")
    fc2 = tf.nn.tanh(fc2)

cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(fc2, label, name='cross_entropy_per_example')
cross_entropy_mean = tf.reduce_mean(cross_entropy, name='cross_entropy')
optimizer = tf.train.MomentumOptimizer(learning_rate, momentum)
train_op = optimizer.minimize(cross_entropy_mean, global_step=global_step)
###

ce_summary = tf.scalar_summary('ce', cross_entropy_mean)

with tf.Session() as session:
    all_summaries = tf.merge_all_summaries()
    summarizer = tf.train.SummaryWriter('./log', session.graph)
    tf.initialize_all_variables().run()
    for i in range(len(points) // batch_size):
        # Slice out the i-th batch (consecutive, non-overlapping chunks).
        batch_points = points[i * batch_size:(i + 1) * batch_size]
        batch_labels = labels[i * batch_size:(i + 1) * batch_size]
        _, ce, cs = session.run(
            [train_op, cross_entropy_mean, ce_summary],
            {pts: batch_points, label: batch_labels})
        summarizer.add_summary(cs, global_step=tf.train.global_step(session, global_step))
        print(ce)
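
Note that the summary and initialization calls above come from the pre-1.0 TensorFlow API. If you run this under TF 1.x, the rough equivalents are (a hedged sketch; the rest of the graph is assumed unchanged):

# Approximate TF 1.x counterparts of the pre-1.0 calls used above.
ce_summary = tf.summary.scalar('ce', cross_entropy_mean)
all_summaries = tf.summary.merge_all()
summarizer = tf.summary.FileWriter('./log', session.graph)
tf.global_variables_initializer().run()

# The loss takes keyword arguments in TF 1.x:
cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
    logits=fc2, labels=label, name='cross_entropy_per_example')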

This does not yet seem to be the best the network can do, but the cross entropy does decrease!
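
If you want a rough sense of how well it actually classifies, a minimal accuracy check could look like this (my own addition, not part of the original answer; it assumes the snippet is placed inside the with tf.Session() block above, after the training loop):

    # Evaluate on fresh points: predict the class with the larger logit and
    # compare against the same x*y > 0 labelling rule used for training.
    test_points = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(1000)]
    test_labels = [1 if x * y > 0.0 else 0 for (x, y) in test_points]
    preds = session.run(tf.argmax(fc2, 1), {pts: test_points})
    accuracy = sum(int(p == l) for p, l in zip(preds, test_labels)) / float(len(test_labels))
    print('test accuracy: %.3f' % accuracy)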

Regarding python - Why doesn't this simple Tensorflow code succeed? (ConvnetJS with Tensorflow), we found a similar question on Stack Overflow: https://stackoverflow.com/questions/38321024/
