gpt4 book ai didi

用于分类特征的 TensorFlow 嵌入

转载 作者:行者123 更新时间:2023-12-01 03:23:30 26 4
gpt4 key购买 nike

在机器学习中,通常用 one-hot-encoding 表示分类(特别是:名义)特征。 .我正在尝试学习如何使用 tensorflow 的嵌入层来表示分类问题中的分类特征。我有 tensorflow version 1.01已安装,我正在使用 Python 3.6 .

我知道 tensorflow tutorial for word2vec ,但这对我的情况不是很有指导意义。在构建 tf.Graph 时,它使用 NCE 特定的权重和 tf.nn.nce_loss .

我只想要一个简单的前馈网络,如下所示,输入层是一个嵌入。我的尝试如下。由于形状不兼容,当我尝试将嵌入与隐藏层矩阵相乘时,它会提示。有什么想法可以解决这个问题吗?

from __future__ import print_function
import pandas as pd;
import tensorflow as tf
import numpy as np
from sklearn.preprocessing import LabelEncoder

if __name__ == '__main__':

# 1 categorical input feature and a binary output
df = pd.DataFrame({'cat2': np.array(['o', 'm', 'm', 'c', 'c', 'c', 'o', 'm', 'm', 'm']),
'label': np.array([0, 0, 1, 1, 0, 0, 1, 0, 1, 1])})

encoder = LabelEncoder()
encoder.fit(df.cat2.values)
X = encoder.transform(df.cat2.values)

Y = np.zeros((len(df), 2))
Y[np.arange(len(df)), df.label.values] = 1

# Neural net parameters
training_epochs = 5
learning_rate = 1e-3
cardinality = len(np.unique(X))
embedding_size = 2
input_X_size = 1
n_labels = len(np.unique(Y))
n_hidden = 10

# Placeholders for input, output
x = tf.placeholder(tf.int32, [None, 1], name="input_x")
y = tf.placeholder(tf.float32, [None, 2], name="input_y")

# Neural network weights
embeddings = tf.Variable(tf.random_uniform([cardinality, embedding_size], -1.0, 1.0))
h = tf.get_variable(name='h2', shape=[embedding_size, n_hidden],
initializer=tf.contrib.layers.xavier_initializer())
W_out = tf.get_variable(name='out_w', shape=[n_hidden, n_labels],
initializer=tf.contrib.layers.xavier_initializer())

# Neural network operations
embedded_chars = tf.nn.embedding_lookup(embeddings, x)

layer_1 = tf.matmul(embedded_chars,h)
layer_1 = tf.nn.relu(layer_1)
out_layer = tf.matmul(layer_1, W_out)

# Define loss and optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=out_layer, labels=y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

# Initializing the variables
init = tf.global_variables_initializer()

# Launch the graph
with tf.Session() as sess:
sess.run(init)

for epoch in range(training_epochs):
avg_cost = 0.

# Run optimization op (backprop) and cost op (to get loss value)
_, c = sess.run([optimizer, cost],
feed_dict={x: X, y: Y})
print("Optimization Finished!")

编辑:

请参阅下面的错误消息:
Traceback (most recent call last):
File "/home/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/common_shapes.py", line 671, in _call_cpp_shape_fn_impl
input_tensors_as_shapes, status)
File "/home/anaconda3/lib/python3.6/contextlib.py", line 89, in __exit__
next(self.gen)
File "/home/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Shape must be rank 2 but is rank 3 for 'MatMul' (op: 'MatMul') with input shapes: [?,1,2], [2,10].

最佳答案

只需让您的x占位符大小 [None]而不是 [None, 1]

关于用于分类特征的 TensorFlow 嵌入,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/43574889/

26 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com