
machine-learning - Learning a square wave function with a neural network

Reposted | Author: 行者123 | Updated: 2023-11-30 08:43:28

Out of curiosity, I tried to use TensorFlow to build a simple fully connected neural network to learn a square wave function like the one shown below:

[Figure: a square wave. Credits to www.thedawstudio.com]

So the input is a 1-D array of x values (as the horizontal axis), and the output is a binary scalar value. I used tf.nn.sparse_softmax_cross_entropy_with_logits as the loss function and tf.nn.relu as the activation. There are 3 hidden layers (100*100*100) and a single input node and output node. The generated input data matches the waveform above, so the data size is not an issue.

However, the trained model seems to fail completely, always predicting the negative class.

So I am trying to figure out why this happens. Is the network configuration suboptimal, or is there some mathematical flaw under the surface (even though I believe a neural network should be able to mimic any function)?

Thanks.

---

Following the suggestions in the comments, here is the full code. One thing I misstated earlier: there are actually 2 output nodes (because there are 2 output classes):

"""
See if neural net can find piecewise linear correlation in the data
"""

import time
import os
import tensorflow as tf
import numpy as np

def generate_placeholder(batch_size):
x_placeholder = tf.placeholder(tf.float32, shape=(batch_size, 1))
y_placeholder = tf.placeholder(tf.float32, shape=(batch_size))
return x_placeholder, y_placeholder

def feed_placeholder(x, y, x_placeholder, y_placeholder, batch_size, loop):
x_selected = [[None]] * batch_size
y_selected = [None] * batch_size
for i in range(batch_size):
x_selected[i][0] = x[min(loop*batch_size, loop*batch_size % len(x)) + i, 0]
y_selected[i] = y[min(loop*batch_size, loop*batch_size % len(y)) + i]
feed_dict = {x_placeholder: x_selected,
y_placeholder: y_selected}
return feed_dict

def inference(input_x, H1_units, H2_units, H3_units):

with tf.name_scope('H1'):
weights = tf.Variable(tf.truncated_normal([1, H1_units], stddev=1.0/2), name='weights')
biases = tf.Variable(tf.zeros([H1_units]), name='biases')
a1 = tf.nn.relu(tf.matmul(input_x, weights) + biases)

with tf.name_scope('H2'):
weights = tf.Variable(tf.truncated_normal([H1_units, H2_units], stddev=1.0/H1_units), name='weights')
biases = tf.Variable(tf.zeros([H2_units]), name='biases')
a2 = tf.nn.relu(tf.matmul(a1, weights) + biases)

with tf.name_scope('H3'):
weights = tf.Variable(tf.truncated_normal([H2_units, H3_units], stddev=1.0/H2_units), name='weights')
biases = tf.Variable(tf.zeros([H3_units]), name='biases')
a3 = tf.nn.relu(tf.matmul(a2, weights) + biases)

with tf.name_scope('softmax_linear'):
weights = tf.Variable(tf.truncated_normal([H3_units, 2], stddev=1.0/np.sqrt(H3_units)), name='weights')
biases = tf.Variable(tf.zeros([2]), name='biases')
logits = tf.matmul(a3, weights) + biases

return logits

def loss(logits, labels):
labels = tf.to_int32(labels)
cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits, name='xentropy')
return tf.reduce_mean(cross_entropy, name='xentropy_mean')

def inspect_y(labels):
return tf.reduce_sum(tf.cast(labels, tf.int32))

def training(loss, learning_rate):
tf.summary.scalar('lost', loss)
optimizer = tf.train.GradientDescentOptimizer(learning_rate)
global_step = tf.Variable(0, name='global_step', trainable=False)
train_op = optimizer.minimize(loss, global_step=global_step)
return train_op

def evaluation(logits, labels):
labels = tf.to_int32(labels)
correct = tf.nn.in_top_k(logits, labels, 1)
return tf.reduce_sum(tf.cast(correct, tf.int32))

def run_training(x, y, batch_size):
with tf.Graph().as_default():
x_placeholder, y_placeholder = generate_placeholder(batch_size)
logits = inference(x_placeholder, 100, 100, 100)
Loss = loss(logits, y_placeholder)
y_sum = inspect_y(y_placeholder)
train_op = training(Loss, 0.01)
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
max_steps = 10000
for step in range(max_steps):
start_time = time.time()
feed_dict = feed_placeholder(x, y, x_placeholder, y_placeholder, batch_size, step)
_, loss_val = sess.run([train_op, Loss], feed_dict = feed_dict)
duration = time.time() - start_time
if step % 100 == 0:
print('Step {}: loss = {:.2f} {:.3f}sec'.format(step, loss_val, duration))
x_test = np.array(range(1000)) * 0.001
x_test = np.reshape(x_test, (1000, 1))
_ = sess.run(logits, feed_dict={x_placeholder: x_test})
print(min(_[:, 0]), max(_[:, 0]), min(_[:, 1]), max(_[:, 1]))
print(_)

if __name__ == '__main__':

population = 10000

input_x = np.random.rand(population)
input_y = np.copy(input_x)

for bin in range(10):
print(bin, bin/10, 0.5 - 0.5*(-1)**bin)
input_y[input_x >= bin/10] = 0.5 - 0.5*(-1)**bin

batch_size = 1000

input_x = np.reshape(input_x, (population, 1))

run_training(input_x, input_y, batch_size)

The sample output shows that the model always prefers the first class over the second, as shown by min(_[:, 0]) > max(_[:, 1]): across the test samples, the smallest logit for the first class is larger than the largest logit for the second class.
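For illustration, the symptom can be checked directly on the test logits (the numbers below are made up):

import numpy as np

logits = np.array([[ 2.3, -1.1],    # hypothetical logits; values made up
                   [ 1.9, -0.7]])
# If the smallest class-0 logit exceeds the largest class-1 logit,
# argmax picks class 0 for every sample.
print(logits[:, 0].min() > logits[:, 1].max())   # True -> always class 0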

---

My mistake. The problem occurs in the following lines:

for i in range(batch_size):
    x_selected[i][0] = x[min(loop*batch_size, loop*batch_size % len(x)) + i, 0]
    y_selected[i] = y[min(loop*batch_size, loop*batch_size % len(y)) + i]

Because x_selected was created as [[None]] * batch_size, all batch_size entries reference the same inner list, so Python was setting the entire x_selected list to the same value. That code issue is now fixed. The fix is:

x_selected = np.zeros((batch_size, 1))
y_selected = np.zeros((batch_size,))
for i in range(batch_size):
    x_selected[i, 0] = x[(loop*batch_size + i) % x.shape[0], 0]
    y_selected[i] = y[(loop*batch_size + i) % y.shape[0]]
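As a minimal standalone demonstration of the aliasing pitfall (variable names are mine, not from the original code):

rows = [[None]] * 3                 # three references to the SAME inner list
rows[0][0] = 42
print(rows)                         # [[42], [42], [42]] -- every "row" changed at once

rows = [[None] for _ in range(3)]   # a fresh inner list per row
rows[0][0] = 42
print(rows)                         # [[42], [None], [None]]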

After this fix, the model shows more variation. Currently it outputs class 0 when x <= 0.5 and class 1 when x > 0.5, i.e. it has learned a single threshold rather than the full square wave. That is still far from ideal.

---

So, after changing the network configuration to 100 nodes * 4 layers, and after 1 million training steps (batch size = 100, sample size = 10 million), the model performs very well, with errors only at the edges where y flips. This question is therefore closed.
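A minimal sketch of that 4-layer variant (the function name inference_4layer and the loop structure are my own; the original simply repeats the H1..H3 pattern with one more hidden layer):

import numpy as np
import tensorflow as tf

def inference_4layer(input_x, H1_units, H2_units, H3_units, H4_units):
    a, in_dim = input_x, 1
    for idx, units in enumerate((H1_units, H2_units, H3_units, H4_units), 1):
        with tf.name_scope('H{}'.format(idx)):
            # stddev mirrors the original: 1/2 for the first layer, 1/fan_in after.
            weights = tf.Variable(tf.truncated_normal([in_dim, units],
                                  stddev=1.0/max(in_dim, 2)), name='weights')
            biases = tf.Variable(tf.zeros([units]), name='biases')
            a = tf.nn.relu(tf.matmul(a, weights) + biases)
        in_dim = units
    with tf.name_scope('softmax_linear'):
        weights = tf.Variable(tf.truncated_normal([in_dim, 2],
                              stddev=1.0/np.sqrt(in_dim)), name='weights')
        biases = tf.Variable(tf.zeros([2]), name='biases')
        logits = tf.matmul(a, weights) + biases
    return logits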

Best Answer

You are essentially trying to learn a periodic function, and the function is highly non-linear and non-smooth, so it is not as simple as it looks. In short, a better representation of the input features would help.

Suppose you have a period T = 2, i.e. f(x) = f(x+2). For the simplified problem where input/output are integers, your function would be f(x) = 1 if x is odd else -1. In that case, your problem reduces to this discussion, where we train a neural network to distinguish odd from even numbers.

I think the second bullet point in that post should help (even for the general case where the input is a float):

Try representing the numbers in binary using a fixed length precision.

In the simplified problem above, it is easy to see that the output is determined once the least significant bit is known:

decimal   binary   -> output
1:        0 0 1    ->  1
2:        0 1 0    -> -1
3:        0 1 1    ->  1
...
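As a concrete sketch of that suggestion for this problem's inputs in [0, 1) (the helper name to_binary_features and the 10-bit precision are my own choices, not from the answer), each float can be expanded into a fixed-length bit vector and fed to the network in place of the raw scalar:

import numpy as np

def to_binary_features(x, n_bits=10):
    """Encode floats in [0, 1) as fixed-length binary fractions (MSB first)."""
    ints = (np.asarray(x) * (1 << n_bits)).astype(int)       # scale to [0, 2**n_bits)
    bits = [(ints >> k) & 1 for k in reversed(range(n_bits))]
    return np.stack(bits, axis=-1).astype(np.float32)        # shape (..., n_bits)

print(to_binary_features([0.3, 0.72], n_bits=4))
# 0.3*16  = 4  (int) -> [0., 1., 0., 0.]
# 0.72*16 = 11 (int) -> [1., 0., 1., 1.]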

Regarding machine-learning - Learning a square wave function with a neural network, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/46210876/
