
python - Results not reproducible with Keras and TensorFlow in Python

Reposted. Author: 太空狗. Updated: 2023-10-29 21:06:33

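I have a problem: I cannot reproduce my results with Keras and TensorFlow.

A workaround for this issue appears to have been published recently on the Keras documentation site, but for some reason it does not work for me.

What am I doing wrong?

I am using a Jupyter Notebook on an MBP Retina (no Nvidia GPU).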

# ** Workaround from Keras Documentation **

import numpy as np
import tensorflow as tf
import random as rn

# The below is necessary in Python 3.2.3 onwards to
# have reproducible behavior for certain hash-based operations.
# See these references for further details:
# https://docs.python.org/3.4/using/cmdline.html#envvar-PYTHONHASHSEED
# https://github.com/fchollet/keras/issues/2280#issuecomment-306959926

import os
os.environ['PYTHONHASHSEED'] = '0'

# The below is necessary for starting Numpy generated random numbers
# in a well-defined initial state.

np.random.seed(42)

# The below is necessary for starting core Python generated random numbers
# in a well-defined state.

rn.seed(12345)

# Force TensorFlow to use single thread.
# Multiple threads are a potential source of
# non-reproducible results.
# For further details, see: https://stackoverflow.com/questions/42022950/which-seeds-have-to-be-set-where-to-realize-100-reproducibility-of-training-res

session_conf = tf.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1)

from keras import backend as K

# The below tf.set_random_seed() will make random number generation
# in the TensorFlow backend have a well-defined initial state.
# For further details, see: https://www.tensorflow.org/api_docs/python/tf/set_random_seed

tf.set_random_seed(1234)

sess = tf.Session(graph=tf.get_default_graph(), config=session_conf)
K.set_session(sess)


# ** Workaround end **

# ** Start of my code **


# LSTM and CNN for sequence classification in the IMDB dataset
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers.embeddings import Embedding
from keras.preprocessing import sequence
from sklearn import metrics
# fix random seed for reproducibility
#np.random.seed(7)

# ... importing data and so on ...

# create the model
embedding_vecor_length = 32
neurons = 91
epochs = 1
model = Sequential()
model.add(Embedding(top_words, embedding_vecor_length, input_length=max_review_length))
model.add(LSTM(neurons))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='mean_squared_logarithmic_error', optimizer='adam', metrics=['accuracy'])
print(model.summary())
model.fit(X_train, y_train, epochs=epochs, batch_size=64)
# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))

Python version used:

Python 3.6.3 |Anaconda custom (x86_64)| (default, Oct  6 2017, 12:04:38) 
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)]

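The workaround is already included in the code above (to no effect).

Every time I run the training part, I get different results.

After resetting the Jupyter Notebook kernel, the first run matches the first run of earlier kernels, and the second run matches the second.

So after a reset I always get, for example, 0.7782 on the first run, 0.7732 on the second run, and so on.

But without a kernel reset, the results differ every time I run it.

I would appreciate any suggestions!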

Best answer

I had exactly the same problem and managed to solve it by closing and restarting the tensorflow session every time I run the model. In your case, it should look like this:

#START A NEW TF SESSION
np.random.seed(0)
tf.set_random_seed(0)
sess = tf.Session(graph=tf.get_default_graph())
K.set_session(sess)

embedding_vecor_length = 32
neurons = 91
epochs = 1
model = Sequential()
model.add(Embedding(top_words, embedding_vecor_length, input_length=max_review_length))
model.add(LSTM(neurons))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='mean_squared_logarithmic_error', optimizer='adam', metrics=['accuracy'])
print(model.summary())
model.fit(X_train, y_train, epochs=epochs, batch_size=64)
# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))

#CLOSE TF SESSION
K.clear_session()

I ran the following code and got reproducible results using a GPU and the tensorflow backend:

from datetime import datetime
from keras.layers import Input, Dense
from keras.models import Model

print(datetime.now())
for i in range(10):
    # Reseed and open a fresh session at the start of every run
    np.random.seed(0)
    tf.set_random_seed(0)
    sess = tf.Session(graph=tf.get_default_graph())
    K.set_session(sess)

    n_classes = 3
    n_epochs = 20
    batch_size = 128

    task = Input(shape=x.shape[1:])
    h = Dense(100, activation='relu', name='shared')(task)
    h1 = Dense(100, activation='relu', name='single1')(h)
    output1 = Dense(n_classes, activation='softmax')(h1)

    model = Model(task, output1)
    model.compile(loss='categorical_crossentropy', optimizer='Adam')
    model.fit(x_train, y_train_onehot, batch_size=batch_size, epochs=n_epochs, verbose=0)
    print(model.evaluate(x=x_test, y=y_test_onehot, batch_size=batch_size, verbose=0))
    # Drop the graph and session so the next run starts from a clean state
    K.clear_session()

and got this output:

2017-10-23 11:27:14.494482
0.489712882132
0.489712893813
0.489712892765
0.489712854426
0.489712882132
0.489712864011
0.486303713004
0.489712903398
0.489712892765
0.489712903398
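
Note that the ten losses above are not bit-identical: most agree to about seven significant digits, and one run differs more visibly. A minimal sketch of how such runs could be compared with a tolerance follows. The relative tolerance of 1e-6 is an arbitrary assumption for illustration, not something stated in the answer, and the sketch only flags the outlier without explaining its cause.

```python
import math

# Loss values copied from the output listed above.
losses = [
    0.489712882132, 0.489712893813, 0.489712892765, 0.489712854426,
    0.489712882132, 0.489712864011, 0.486303713004, 0.489712903398,
    0.489712892765, 0.489712903398,
]

# Assumed tolerance: treat runs as "the same" if they agree to ~6 digits.
REL_TOL = 1e-6

reference = losses[0]
# Indices (0-based) of runs that do not match the first run within REL_TOL.
outliers = [
    i for i, value in enumerate(losses)
    if not math.isclose(value, reference, rel_tol=REL_TOL)
]
print(outliers)  # → [6]
```

With this tolerance, nine of the ten runs count as matching the first one, and only the seventh run (index 6, loss 0.486303713004) stands out.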

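My understanding is that if you do not close the tf session (which you do implicitly when you run in a new kernel), you keep sampling from the same "seeded" distribution.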
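
The behaviour described in the question (run 1 and run 2 differ, but each repeats after a kernel restart) can be illustrated without TensorFlow. The sketch below uses Python's standard `random` module as a stand-in. `fake_training_run` is a hypothetical placeholder for building and training a model, and the analogy to TensorFlow's graph-level seed is an assumption made for illustration only.

```python
import random


def fake_training_run(rng):
    """Hypothetical stand-in for training: draw three 'initial weights'."""
    return [rng.random() for _ in range(3)]


# Case 1: seed once (like running the seeding cell once), then "train" twice.
rng = random.Random(0)
first = fake_training_run(rng)
second = fake_training_run(rng)
# The second run continues the same random stream, so it differs from the first.
seeded_once_matches = first == second

# Case 2: simulate a kernel restart by starting again from the same seed.
restarted = random.Random(0)
restart_first = fake_training_run(restarted)
restart_second = fake_training_run(restarted)

# Case 3: reseed before every run, as the loop in the answer does.
runs = []
for _ in range(2):
    rng = random.Random(0)
    runs.append(fake_training_run(rng))
reseeded_matches = runs[0] == runs[1]

print(seeded_once_matches)   # runs differ when seeded only once
print(restart_first == first, restart_second == second)  # restart repeats the sequence
print(reseeded_matches)      # runs agree when reseeded each time
```

Seeding once fixes the whole sequence of runs rather than each run on its own, which is why reseeding (and, in the answer, resetting the session) at the start of every run is what makes individual runs comparable.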

This post is based on a similar question found on Stack Overflow, "python - Results not reproducible with Keras and TensorFlow in Python": https://stackoverflow.com/questions/46836857/
