python - Tensorflow tf.constant_initializer is very slow

Reposted. Author: 太空宇宙. Updated: 2023-11-03 12:02:36

I am trying to train an LSTM using pre-trained 100-dimensional word2vec embeddings.

import codecs
import time

import tensorflow as tf

# Excerpt: a static method of the model class (the class itself is not shown).
@staticmethod
def load_embeddings(pre_trained_embeddings_path, word_embed_size):
    embd = []
    start_time = time.time()
    cnt = 0  # number of word vectors read so far
    with codecs.open(pre_trained_embeddings_path, mode="r", encoding='utf-8') as f:
        for line in f:
            # each line is "<word> <v1> <v2> ..."; keep only the vector components
            values = line.strip().split(' ')
            embd.append(values[1:])
            cnt += 1
            if cnt % 100000 == 0:
                print("word-vectors loaded: %d" % cnt)

    embedding, vocab_size, embed_dim = embd, len(embd), len(embd[0])

    load_end_time = time.time()
    print("word vectors loaded from and start initialising, cnt: %d, time taken: %d secs " % (vocab_size, load_end_time - start_time))

    # the whole embedding table is handed to the graph as a constant initializer
    embedding_init = tf.constant_initializer(embedding, dtype=tf.float16)
    src_word_embedding = tf.get_variable(shape=[vocab_size, embed_dim], initializer=embedding_init, trainable=False, name='word_embedding', dtype=tf.float16)

    print("word-vectors loaded and initialised, cnt: %d, time taken: %d secs" % (vocab_size, time.time() - load_end_time))

    return src_word_embedding

Running this method produces the following output:

word vectors loaded from and start initialising, cnt: 2419080, time taken: 74 secs
word-vectors loaded and initialised, cnt: 2419080, time taken: 1647 secs

System info: tensorflow 1.1.0, tcmalloc, python 3.6, ubuntu 14.04

Does half an hour for initialization seem slow, or is this normal behavior? Any idea what the problem might be?

Update: feeding the embeddings with @sirfz's approach made loading them very fast; initialization now completes in 85 seconds.
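
For a sense of scale, the size of the embedding table can be estimated from the numbers in the log output above. This is only a back-of-the-envelope sketch: it takes the vocabulary size from the log, the dimension from the "100-dimensional" description, and assumes 2 bytes per value because the variable is declared as tf.float16. The variable names are illustrative and do not come from the original code.

```python
# Rough size of the embedding table described in the question.
vocab_size = 2419080     # "cnt" reported in the log output
embed_dim = 100          # 100-dimensional word2vec vectors
bytes_per_value = 2      # assumption: tf.float16 stores 2 bytes per value

num_values = vocab_size * embed_dim
table_bytes = num_values * bytes_per_value
table_mib = table_bytes / (1024 ** 2)

print("values in table: %d" % num_values)
print("approx. size: %d bytes (%.1f MiB)" % (table_bytes, table_mib))
```

So several hundred megabytes of values are being passed to the initializer as a single constant, which is the "large constant" situation the answer below refers to.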

Best answer

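Loading large constants into the graph is not only slower, it also leaks a lot of memory. I had a similar issue, which I reported not long ago, and the best workaround for me was: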

# placeholder for loading your saved embeddings
embedding_init = tf.placeholder(tf.float16, shape=[vocab_size, embed_dim])
src_word_embedding = tf.get_variable(initializer=embedding_init, trainable=False, name='word_embedding', dtype=tf.float16)

# run initialization with the value of embeddings placeholder
session.run(tf.global_variables_initializer(), feed_dict={embedding_init: embedding})
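
The snippet above relies on `vocab_size`, `embed_dim`, `embedding` and `session` from the surrounding program. As a self-contained illustration of the same placeholder-based initialization, here is a minimal sketch on a small random table. It assumes a TensorFlow installation that ships the `tensorflow.compat.v1` module (not the 1.1.0 release from the question), and the helper name `build_embedding_variable` and the toy table size are hypothetical, chosen only for this example.

```python
import numpy as np
import tensorflow.compat.v1 as tf  # assumption: a TF build that provides compat.v1

tf.disable_eager_execution()  # use graph mode, as in the TF 1.x code above


def build_embedding_variable(vocab_size, embed_dim, dtype=tf.float16):
    """Create a non-trainable variable whose initial value is fed at init time."""
    # placeholder that will receive the embedding values when the initializer runs
    embedding_init = tf.placeholder(dtype, shape=[vocab_size, embed_dim])
    src_word_embedding = tf.get_variable(
        name="word_embedding",
        initializer=embedding_init,  # a tensor initializer, so no shape argument
        trainable=False,
        dtype=dtype,
    )
    return embedding_init, src_word_embedding


# toy stand-in for the real pre-trained table (hypothetical size)
embedding = np.random.rand(1000, 100).astype(np.float16)

graph = tf.Graph()
with graph.as_default():
    embedding_init, src_word_embedding = build_embedding_variable(*embedding.shape)
    init_op = tf.global_variables_initializer()

# size of the serialized graph definition, to compare with the table itself
graph_bytes = graph.as_graph_def().ByteSize()

with tf.Session(graph=graph) as session:
    # the table is supplied through feed_dict instead of being stored in the graph
    session.run(init_op, feed_dict={embedding_init: embedding})
    loaded = session.run(src_word_embedding)

print("graph def: %d bytes, table: %d bytes" % (graph_bytes, embedding.nbytes))
```

Because the values arrive through `feed_dict`, the graph definition only holds the placeholder and variable ops rather than a copy of the table. On a TF 1.x setup like the one in the question, a plain `import tensorflow as tf` without the `disable_eager_execution()` call should play the same role.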

Regarding "python - Tensorflow tf.constant_initializer is very slow", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/44353509/
