
python - Keras index out of bounds error with a CSV database

Reposted · Author: 太空宇宙 · Updated: 2023-11-04 02:46:30

This is my first time posting on Stack Overflow, after using the site for a while.

I have been trying to predict the last column of a practice machine learning dataset from this link: http://archive.ics.uci.edu/ml/datasets/Diabetes+130-US+hospitals+for+years+1999-2008#

I run the code below and get this error:

Traceback (most recent call last):

  File "", line 1, in <module>
    runfile('/Users/ric4711/diabetes_tensorflow', wdir='/Users/ric4711')

  File "/Users/ric4711/anaconda/lib/python2.7/site-packages/spyder/utils/site/sitecustomize.py", line 880, in runfile
    execfile(filename, namespace)

  File "/Users/ric4711/anaconda/lib/python2.7/site-packages/spyder/utils/site/sitecustomize.py", line 94, in execfile
    builtins.execfile(filename, *where)

  File "/Users/ric4711/diabetes_tensorflow", line 60, in <module>
    y_train = to_categorical(y_train, num_classes = num_classes)

  File "/Users/ric4711/anaconda/lib/python2.7/site-packages/keras/utils/np_utils.py", line 25, in to_categorical
    categorical[np.arange(n), y] = 1

IndexError: index 3 is out of bounds for axis 1 with size 3

I suspect there may be a problem with the dimensions of my y, or with the way I am managing the categories. Any help would be greatly appreciated.
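(Editor's note, not in the original post: the error means some encoded label is 3, i.e. there are at least four distinct label values, while to_categorical was told to expect three. A quick sanity check is to count the distinct values in the target column before one-hot encoding. The values below are placeholders standing in for `dataset[[49]]`:)

```python
import pandas as pd

# Stand-in for the target column; the real values come from the CSV.
y = pd.Series(["NO", "<30", ">30", "NO", ">30", "Other"])

# This is the number that num_classes must be at least as large as.
print(y.nunique())  # 4
```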

from pandas import read_csv
import numpy
from sklearn.model_selection import train_test_split
from keras.utils import to_categorical
from sklearn.preprocessing import LabelEncoder
from keras.layers import Dense, Input
from keras.models import Model

dataset = read_csv(r"/Users/ric4711/Documents/dataset_diabetes/diabetic_data.csv", header=None)
#Column 2, 5, 10, 11, 18, 19, 20 all have "?"
#(101767, 50) size of dataset
#PROBLEM COLUMNS WITH NUMBER OF "?"
#2 2273
#5 98569
#10 40256
#11 49949
#18 21
#19 358
#20 1423
le=LabelEncoder()

dataset[[2,5,10,11,18,19,20]] = dataset[[2,5,10,11,18,19,20]].replace("?", numpy.NaN)

dataset = dataset.drop(dataset.columns[[0, 1, 5, 10, 11]], axis=1)
dataset.dropna(inplace=True)


y = dataset[[49]]
X = dataset.drop(dataset.columns[[44]], 1)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

for col in X_test.columns.values:
    if X_test[col].dtypes=='object':
        data=X_train[col].append(X_test[col])
        le.fit(data.values)
        X_train[col]=le.transform(X_train[col])
        X_test[col]=le.transform(X_test[col])

for col in y_test.columns.values:
    if y_test[col].dtypes=='object':
        data=y_train[col].append(y_test[col])
        le.fit(data.values)
        y_train[col]=le.transform(y_train[col])
        y_test[col]=le.transform(y_test[col])


batch_size = 500
num_epochs = 300
hidden_size = 250

num_test = X_test.shape[0]
num_training = X_train.shape[0]
height, width, depth = 1, X_train.shape[1], 1
num_classes = 3

y_train = y_train.as_matrix()
y_test = y_test.as_matrix()

y_train = to_categorical(y_train, num_classes = num_classes)
y_test = to_categorical(y_test, num_classes = num_classes)

inp = Input(shape=(height * width,))
hidden_1 = Dense(hidden_size, activation='tanh')(inp)
hidden_2 = Dense(hidden_size, activation='tanh')(hidden_1)
hidden_3 = Dense(hidden_size, activation='tanh')(hidden_2)
hidden_4 = Dense(hidden_size, activation='tanh')(hidden_3)
hidden_5 = Dense(hidden_size, activation='tanh')(hidden_4)
hidden_6 = Dense(hidden_size, activation='tanh')(hidden_5)
hidden_7 = Dense(hidden_size, activation='tanh')(hidden_6)
hidden_8 = Dense(hidden_size, activation='tanh')(hidden_7)
hidden_9 = Dense(hidden_size, activation='tanh')(hidden_8)
hidden_10 = Dense(hidden_size, activation='tanh')(hidden_9)
hidden_11 = Dense(hidden_size, activation='tanh')(hidden_10)
out = Dense(num_classes, activation='softmax')(hidden_11)


model = Model(inputs=inp, outputs=out)

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])


model.fit(X_train, y_train, batch_size = batch_size,epochs = num_epochs, validation_split = 0.1, shuffle = True)

model.evaluate(X_test, y_test, verbose=1)

Best answer

I fixed this by changing num_classes to 4, and by passing numpy.array(X_train) and numpy.array(y_train) to the .fit method.
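(Editor's note: the sketch below, which is not from the original answer, shows why that fix works. It re-implements the one-hot indexing from the Keras traceback line `categorical[np.arange(n), y] = 1` with plain NumPy, so the sample labels are illustrative; with num_classes = 3 a label of 3 has no column and raises the same IndexError, while deriving the class count from the data avoids the hard-coded guess:)

```python
import numpy as np

def one_hot(y, num_classes):
    # Minimal stand-in for keras.utils.to_categorical, mirroring the
    # indexing that raised the IndexError in the question.
    y = np.asarray(y, dtype=int).ravel()
    n = y.shape[0]
    categorical = np.zeros((n, num_classes))
    categorical[np.arange(n), y] = 1  # fails if any label >= num_classes
    return categorical

labels = np.array([0, 1, 2, 3])  # four distinct encoded labels

# one_hot(labels, 3) would reproduce the IndexError: label 3 has no column.
# Deriving the class count from the labels themselves is safer:
num_classes = labels.max() + 1   # 4
encoded = one_hot(labels, num_classes)
print(encoded.shape)             # (4, 4)
```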

Regarding "python - Keras index out of bounds error with a CSV database", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/45015755/
