gpt4 book ai didi

tensorflow - 神经网络可以处理冗余输入吗?

转载 作者:行者123 更新时间:2023-12-04 04:12:53 25 4
gpt4 key购买 nike

我有一个完全连接的神经网络,每层的神经元数量如下 [4, 20, 20, 20, ..., 1]。我正在使用 TensorFlow,4 个实数值输入对应于空间和时间中的特定点,即 (x, y, z, t),1 个实数值输出对应于该点的温度。损失函数只是我预测的温度与 (x, y, z, t) 中该点的实际温度之间的均方误差。我有一组训练数据点,其输入结构如下:


(x,y,z,t):

(0.11,0.12,1.00,0.41)

(0.34,0.43,1.00,0.92)

(0.01,0.25,1.00,0.65)

...

(0.71,0.32,1.00,0.49)

(0.31,0.22,1.00,0.01)

(0.21,0.13,1.00,0.71)


即,您会注意到训练数据在z 中都具有相同的冗余值,但是xyt 通常不是多余的。然而,我发现由于冗余,我的神经网络无法训练这些数据。特别是,每次我开始训练神经网络时,它似乎都失败了,损失函数变成了nan。但是,如果我改变神经网络的结构,使每一层的神经元数量为 [3, 20, 20, 20, ..., 1],即现在数据点只对应对于 (x, y, t) 的输入,一切正常,训练也正常。但是有没有办法克服这个问题呢? (注意:无论是否有任何变量相同,都会发生这种情况,例如 xyt 可能是多余的并导致此错误。 ) 我也尝试了不同的激活函数(例如 ReLU)并改变了网络中层数和神经元的数量,但这些改变并没有解决问题。

我的问题:有没有办法在保持冗余 z 作为输入的同时继续训练神经网络?碰巧我目前正在考虑的特定训练数据集所有 z 都是冗余的,但总的来说,我将来会有来自不同 z 的数据。因此,寻求一种方法来确保神经网络能够鲁棒地处理当前时刻的输入。

下面编码了一个最小的工作示例。运行此示例时,损失输出为 nan,但如果您简单地取消注释第 12 行中的 x_z 以确保 x_z 现在有变化,那么就没有问题了。但这不是解决方案,因为目标是使用具有所有常量值的原始 x_z

import numpy as np 
import tensorflow as tf

end_it = 10000 #number of iterations
frac_train = 1.0 #randomly sampled fraction of data to create training set
frac_sample_train = 0.1 #randomly sampled fraction of data from training set to train in batches
layers = [4, 20, 20, 20, 20, 20, 20, 20, 20, 1]
len_data = 10000
x_x = np.array([np.linspace(0.,1.,len_data)])
x_y = np.array([np.linspace(0.,1.,len_data)])
x_z = np.array([np.ones(len_data)*1.0])
#x_z = np.array([np.linspace(0.,1.,len_data)])
x_t = np.array([np.linspace(0.,1.,len_data)])
y_true = np.array([np.linspace(-1.,1.,len_data)])

N_train = int(frac_train*len_data)
idx = np.random.choice(len_data, N_train, replace=False)

x_train = x_x.T[idx,:]
y_train = x_y.T[idx,:]
z_train = x_z.T[idx,:]
t_train = x_t.T[idx,:]
v1_train = y_true.T[idx,:]

sample_batch_size = int(frac_sample_train*N_train)

np.random.seed(1234)
tf.set_random_seed(1234)
import logging
logging.getLogger('tensorflow').setLevel(logging.ERROR)
tf.logging.set_verbosity(tf.logging.ERROR)

class NeuralNet:
def __init__(self, x, y, z, t, v1, layers):
X = np.concatenate([x, y, z, t], 1)
self.lb = X.min(0)
self.ub = X.max(0)
self.X = X
self.x = X[:,0:1]
self.y = X[:,1:2]
self.z = X[:,2:3]
self.t = X[:,3:4]
self.v1 = v1
self.layers = layers
self.weights, self.biases = self.initialize_NN(layers)
self.sess = tf.Session(config=tf.ConfigProto(allow_soft_placement=False,
log_device_placement=False))
self.x_tf = tf.placeholder(tf.float32, shape=[None, self.x.shape[1]])
self.y_tf = tf.placeholder(tf.float32, shape=[None, self.y.shape[1]])
self.z_tf = tf.placeholder(tf.float32, shape=[None, self.z.shape[1]])
self.t_tf = tf.placeholder(tf.float32, shape=[None, self.t.shape[1]])
self.v1_tf = tf.placeholder(tf.float32, shape=[None, self.v1.shape[1]])
self.v1_pred = self.net(self.x_tf, self.y_tf, self.z_tf, self.t_tf)
self.loss = tf.reduce_mean(tf.square(self.v1_tf - self.v1_pred))
self.optimizer = tf.contrib.opt.ScipyOptimizerInterface(self.loss,
method = 'L-BFGS-B',
options = {'maxiter': 50,
'maxfun': 50000,
'maxcor': 50,
'maxls': 50,
'ftol' : 1.0 * np.finfo(float).eps})
init = tf.global_variables_initializer()
self.sess.run(init)
def initialize_NN(self, layers):
weights = []
biases = []
num_layers = len(layers)
for l in range(0,num_layers-1):
W = self.xavier_init(size=[layers[l], layers[l+1]])
b = tf.Variable(tf.zeros([1,layers[l+1]], dtype=tf.float32), dtype=tf.float32)
weights.append(W)
biases.append(b)
return weights, biases
def xavier_init(self, size):
in_dim = size[0]
out_dim = size[1]
xavier_stddev = np.sqrt(2/(in_dim + out_dim))
return tf.Variable(tf.truncated_normal([in_dim, out_dim], stddev=xavier_stddev), dtype=tf.float32)
def neural_net(self, X, weights, biases):
num_layers = len(weights) + 1
H = 2.0*(X - self.lb)/(self.ub - self.lb) - 1.0
for l in range(0,num_layers-2):
W = weights[l]
b = biases[l]
H = tf.tanh(tf.add(tf.matmul(H, W), b))
W = weights[-1]
b = biases[-1]
Y = tf.add(tf.matmul(H, W), b)
return Y
def net(self, x, y, z, t):
v1_out = self.neural_net(tf.concat([x,y,z,t], 1), self.weights, self.biases)
v1 = v1_out[:,0:1]
return v1
def callback(self, loss):
global Nfeval
print(str(Nfeval)+' - Loss in loop: %.3e' % (loss))
Nfeval += 1
def fetch_minibatch(self, x_in, y_in, z_in, t_in, den_in, N_train_sample):
idx_batch = np.random.choice(len(x_in), N_train_sample, replace=False)
x_batch = x_in[idx_batch,:]
y_batch = y_in[idx_batch,:]
z_batch = z_in[idx_batch,:]
t_batch = t_in[idx_batch,:]
v1_batch = den_in[idx_batch,:]
return x_batch, y_batch, z_batch, t_batch, v1_batch
def train(self, end_it):
it = 0
while it < end_it:
x_res_batch, y_res_batch, z_res_batch, t_res_batch, v1_res_batch = self.fetch_minibatch(self.x, self.y, self.z, self.t, self.v1, sample_batch_size) # Fetch residual mini-batch
tf_dict = {self.x_tf: x_res_batch, self.y_tf: y_res_batch, self.z_tf: z_res_batch, self.t_tf: t_res_batch,
self.v1_tf: v1_res_batch}
self.optimizer.minimize(self.sess,
feed_dict = tf_dict,
fetches = [self.loss],
loss_callback = self.callback)
def predict(self, x_star, y_star, z_star, t_star):
tf_dict = {self.x_tf: x_star, self.y_tf: y_star, self.z_tf: z_star, self.t_tf: t_star}
v1_star = self.sess.run(self.v1_pred, tf_dict)
return v1_star

model = NeuralNet(x_train, y_train, z_train, t_train, v1_train, layers)

Nfeval = 1
model.train(end_it)

最佳答案

我认为你的问题出在这一行:

H = 2.0*(X - self.lb)/(self.ub - self.lb) - 1.0

X第三列,对应z变量,self.lbself.ub 是相同的值,并且等于示例中的值,在本例中为 1,因此实际上是在计算:

2.0*(1.0 - 1.0)/(1.0 - 1.0) - 1.0 = 2.0*0.0/0.0 - 1.0

这是 nan。您可以通过几种不同的方式解决这个问题,一个简单的选择是简单地做:

# Avoids dividing by zero
X_d = tf.math.maximum(self.ub - self.lb, 1e-6)
H = 2.0*(X - self.lb)/X_d - 1.0

关于tensorflow - 神经网络可以处理冗余输入吗?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/61418788/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com