
python - Training TensorFlow to predict a column in a csv file

Reposted. Author: 太空狗. Updated: 2023-10-29 20:27:20

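I have data that is structured in a csv file. I want to be able to predict whether column 1 is a 1 or a 0, given all of the other columns. How do I go about training a program (preferably using neural networks) to use all of the given data to make that prediction? Is there code that someone can show me? I've tried feeding it a numpy.ndarray, a FIFOQueue (sorry if I misspelled that), and a DataFrame; nothing has worked yet. Here is the code I run up to the point where I get the error-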

import tensorflow as tf
import numpy as np
from numpy import genfromtxt

data = genfromtxt('cs-training.csv',delimiter=',')

x = tf.placeholder("float", [None, 11])
W = tf.Variable(tf.zeros([11,2]))
b = tf.Variable(tf.zeros([2]))

y = tf.nn.softmax(tf.matmul(x,W) + b)
y_ = tf.placeholder("float", [None,2])

cross_entropy = -tf.reduce_sum(y_*tf.log(y))

train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)

init = tf.initialize_all_variables()

sess = tf.Session()
sess.run(init)

for i in range(1000):
    batch_xs, batch_ys = data.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

At this point I get this error-

---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-128-b48741faa01b> in <module>()
1 for i in range(1000):
----> 2 batch_xs, batch_ys = data.train.next_batch(100)
3 sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

AttributeError: 'numpy.ndarray' object has no attribute 'train'

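Any help is much appreciated. All I need to do is predict whether column 1 is a 1 or a 0. Even if all you do is get me past this one error, I should be able to take it from there.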

Edit: this is what the csv looks like when I print it out.

[[1,0.766126609,45,2,0.802982129,9120,13,0,6,0,2],
[0,0.957151019,40,0,0.121876201,2600,4,0,0,0,1],
[0,0.65818014,38,1,0.085113375,3042,2,1,0,0,0],
[0,0.233809776,30,0,0.036049682,3300,5,0,0,0,0]]

I'm trying to predict the first column.
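The AttributeError arises because genfromtxt returns a plain numpy.ndarray, while data.train.next_batch appears to come from the dataset object used in the TensorFlow MNIST tutorial. Below is a minimal sketch of one way to slice the printed rows and draw mini-batches by hand. Here next_batch is a hypothetical helper, not a TensorFlow or NumPy function, and the seed and batch size are arbitrary. With the label column removed there are ten feature columns, so a matching placeholder would be [None, 10] rather than [None, 11].

```python
import numpy as np

# The four rows printed in the question; column 0 is the label.
data = np.array([
    [1, 0.766126609, 45, 2, 0.802982129, 9120, 13, 0, 6, 0, 2],
    [0, 0.957151019, 40, 0, 0.121876201, 2600, 4, 0, 0, 0, 1],
    [0, 0.65818014, 38, 1, 0.085113375, 3042, 2, 1, 0, 0, 0],
    [0, 0.233809776, 30, 0, 0.036049682, 3300, 5, 0, 0, 0, 0],
])

labels = data[:, 0].astype(int)  # shape (4,)
features = data[:, 1:]           # shape (4, 10): ten feature columns

# One-hot encode the 0/1 labels so they fit a [None, 2] placeholder.
labels_onehot = np.eye(2)[labels]


def next_batch(xs, ys, batch_size, rng):
    """Hypothetical helper: return a random mini-batch of matching rows."""
    size = min(batch_size, len(xs))
    idx = rng.choice(len(xs), size=size, replace=False)
    return xs[idx], ys[idx]


rng = np.random.default_rng(0)  # arbitrary seed, for repeatability only
batch_xs, batch_ys = next_batch(features, labels_onehot, 2, rng)
print(batch_xs.shape, batch_ys.shape)  # (2, 10) (2, 2)
```

In the training loop, the pair returned by such a helper would take the place of data.train.next_batch(100).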

Best answer

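The following reads from a CSV file and builds a TensorFlow program. The example uses the Iris data set, since that may be a more meaningful example. However, it should work for your data as well.

Note that the first column will be [0, 1 or 2], since there are 3 species of iris.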
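As a quick illustration of that label-first layout, here is a small sketch that parses a few comma-separated rows with NumPy and separates the label column from the features. The three rows are illustrative values in the style of the Iris data, and csv_text and n_classes are names chosen for this sketch only.

```python
import io

import numpy as np

# Three illustrative rows: class index first, then four feature values.
csv_text = "0,5.1,3.5,1.4,0.2\n1,7.0,3.2,4.7,1.4\n2,6.3,3.3,6.0,2.5\n"

# genfromtxt accepts a file-like object as well as a file name.
data = np.genfromtxt(io.StringIO(csv_text), delimiter=",")

labels = data[:, 0].astype(int)  # first column: class index
features = data[:, 1:]           # remaining columns: features
n_classes = labels.max() + 1     # width a one-hot label row would need

print(data.shape, labels.tolist(), n_classes)  # (3, 5) [0, 1, 2] 3
```

For a binary column such as the one in the question, the same count would give 2 as long as both classes occur in the file.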

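#!/usr/bin/env python
# Written against the TensorFlow 1.x graph API (tf.placeholder, tf.Session).
import tensorflow as tf
import numpy as np
from numpy import genfromtxt

# Build example data in CSV format, using the Iris data set
from sklearn import datasets
# sklearn.cross_validation was removed in scikit-learn 0.20;
# train_test_split now lives in sklearn.model_selection
from sklearn.model_selection import train_test_split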
def buildDataFromIris():
    iris = datasets.load_iris()
    X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.33, random_state=42)
    # Write each split with the label first, followed by the features
    with open('cs-training.csv', 'w') as f:
        for i, j in enumerate(X_train):
            k = np.append(np.array(y_train[i]), j)
            f.write(",".join([str(s) for s in k]) + '\n')
    with open('cs-testing.csv', 'w') as f:
        for i, j in enumerate(X_test):
            k = np.append(np.array(y_test[i]), j)
            f.write(",".join([str(s) for s in k]) + '\n')


# Convert the label column (column 0) to one-hot rows
def convertOneHot(data):
    y = np.array([int(i[0]) for i in data])
    y_onehot = [0] * len(y)
    for i, j in enumerate(y):
        y_onehot[i] = [0] * (y.max() + 1)
        y_onehot[i][j] = 1
    return (y, y_onehot)


buildDataFromIris()


data = genfromtxt('cs-training.csv',delimiter=',') # Training data
test_data = genfromtxt('cs-testing.csv',delimiter=',') # Test data

x_train=np.array([ i[1::] for i in data])
y_train,y_train_onehot = convertOneHot(data)

x_test=np.array([ i[1::] for i in test_data])
y_test,y_test_onehot = convertOneHot(test_data)


# A = number of features, 4 in this example
# B = 3 species of Iris (setosa, virginica and versicolor)
A=data.shape[1]-1 # Number of features; note the first column is y
B=len(y_train_onehot[0])
tf_in = tf.placeholder("float", [None, A]) # Features
tf_weight = tf.Variable(tf.zeros([A,B]))
tf_bias = tf.Variable(tf.zeros([B]))
tf_softmax = tf.nn.softmax(tf.matmul(tf_in,tf_weight) + tf_bias)

# Training via backpropagation
tf_softmax_correct = tf.placeholder("float", [None,B])
tf_cross_entropy = -tf.reduce_sum(tf_softmax_correct*tf.log(tf_softmax))

# Train using tf.train.GradientDescentOptimizer
tf_train_step = tf.train.GradientDescentOptimizer(0.01).minimize(tf_cross_entropy)

# Add accuracy checking nodes
tf_correct_prediction = tf.equal(tf.argmax(tf_softmax,1), tf.argmax(tf_softmax_correct,1))
tf_accuracy = tf.reduce_mean(tf.cast(tf_correct_prediction, "float"))

# Initialize and run
init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)

print("...")
# Run the training
for i in range(30):
    sess.run(tf_train_step, feed_dict={tf_in: x_train, tf_softmax_correct: y_train_onehot})

    # Print accuracy on the test set after each training step
    result = sess.run(tf_accuracy, feed_dict={tf_in: x_test, tf_softmax_correct: y_test_onehot})
    print("Run {},{}".format(i, result))


"""
Below is the output
...
Run 0,0.319999992847
Run 1,0.300000011921
Run 2,0.379999995232
Run 3,0.319999992847
Run 4,0.300000011921
Run 5,0.699999988079
Run 6,0.680000007153
Run 7,0.699999988079
Run 8,0.680000007153
Run 9,0.699999988079
Run 10,0.680000007153
Run 11,0.680000007153
Run 12,0.540000021458
Run 13,0.419999986887
Run 14,0.680000007153
Run 15,0.699999988079
Run 16,0.680000007153
Run 17,0.699999988079
Run 18,0.680000007153
Run 19,0.699999988079
Run 20,0.699999988079
Run 21,0.699999988079
Run 22,0.699999988079
Run 23,0.699999988079
Run 24,0.680000007153
Run 25,0.699999988079
Run 26,1.0
Run 27,0.819999992847
...

Ref:
https://gist.github.com/mchirico/bcc376fb336b73f24b29#file-tensorflowiriscsv-py
"""

Hope this helps.
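For readers who want to see what the graph in the script computes, here is a NumPy-only sketch of the same kind of model: softmax over a linear layer, summed cross-entropy, and plain full-batch gradient descent starting from zero weights. It is an illustration under stated assumptions, not the answer's code: the function names and the toy data are made up for this sketch, and the gradient is written out by hand instead of being derived by TensorFlow.

```python
import numpy as np


def softmax(z):
    """Row-wise softmax, shifted by the row max for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def train_softmax_regression(x, y_onehot, lr=0.01, steps=30):
    """Full-batch gradient descent on summed cross-entropy, zero-initialized."""
    n_features, n_classes = x.shape[1], y_onehot.shape[1]
    w = np.zeros((n_features, n_classes))
    b = np.zeros(n_classes)
    for _ in range(steps):
        p = softmax(x @ w + b)
        grad_logits = p - y_onehot  # gradient of the loss w.r.t. the logits
        w -= lr * (x.T @ grad_logits)
        b -= lr * grad_logits.sum(axis=0)
    return w, b


def accuracy(x, y_onehot, w, b):
    """Fraction of rows whose arg-max prediction matches the one-hot label."""
    pred = softmax(x @ w + b).argmax(axis=1)
    return float((pred == y_onehot.argmax(axis=1)).mean())


# Toy, linearly separable data invented for this sketch (not the Iris set).
x = np.array([[-1.0, -0.8], [-0.9, -1.1], [1.0, 0.8], [0.9, 1.1]])
y = np.eye(2)[[0, 0, 1, 1]]

w, b = train_softmax_regression(x, y)
print(accuracy(x, y, w, b))  # prints 1.0
```

The defaults lr=0.01 and steps=30 mirror the learning rate and loop count in the script above. On real data with columns of very different magnitudes, those values may need tuning.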

Regarding python - Training TensorFlow to predict a column in a csv file, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/33789485/
