
tensorflow - How do I save a TextVectorization layer to disk in TensorFlow?

Reposted. Author: 行者123. Updated: 2023-12-03 10:08:27

import tensorflow as tf
from tensorflow.keras.layers.experimental.preprocessing import TextVectorization

# text_clean: the asker's cleaned text strings (not shown in the question)
text_dataset = tf.data.Dataset.from_tensor_slices(text_clean)

vectorizer = TextVectorization(max_tokens=100000, output_mode='tf-idf', ngrams=None)

vectorizer.adapt(text_dataset.batch(1024))
I have trained a TextVectorization layer and want to save it to disk so that I can reload it next time. I tried pickle and joblib.dump, but neither works.
How can I save it?
The error raised is:
InvalidArgumentError: Cannot convert a Tensor of dtype resource to a NumPy array

Best answer

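Instead of pickling the layer object, pickle its config and weights. Later, unpickle them, use the config to re-create the layer, and load the saved weights into it. Official docs here.
Code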

import pickle

import tensorflow as tf
from tensorflow.keras.layers.experimental.preprocessing import TextVectorization

text_dataset = tf.data.Dataset.from_tensor_slices([
    "this is some clean text",
    "some more text",
    "even some more text"])

# Fit a TextVectorization layer
vectorizer = TextVectorization(max_tokens=10, output_mode='tf-idf', ngrams=None)
vectorizer.adapt(text_dataset.batch(1024))

# Vector for the word "this"
print(vectorizer("this"))

# Pickle the config and weights instead of the layer object itself
with open("tv_layer.pkl", "wb") as f:
    pickle.dump({'config': vectorizer.get_config(),
                 'weights': vectorizer.get_weights()}, f)

print("*" * 10)

# Later you can unpickle and use
# `config` to re-create the layer and
# `weights` to load the trained weights.
with open("tv_layer.pkl", "rb") as f:
    from_disk = pickle.load(f)
new_v = TextVectorization.from_config(from_disk['config'])
# You have to call `adapt` with some dummy data first (BUG in Keras)
new_v.adapt(tf.data.Dataset.from_tensor_slices(["xyz"]))
new_v.set_weights(from_disk['weights'])

# Let's see the vector for the word "this"
print(new_v("this"))
Output:
tf.Tensor(
[[0. 0. 0. 0. 0.91629076 0.
0. 0. 0. 0. ]], shape=(1, 10), dtype=float32)
**********
tf.Tensor(
[[0. 0. 0. 0. 0.91629076 0.
0. 0. 0. 0. ]], shape=(1, 10), dtype=float32)
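
The same idea (persist plain config and state, not the live object) can be illustrated without TensorFlow. The following is only a toy sketch: `ToyVectorizer` is a hypothetical class invented for this example, and its `threading.Lock` attribute stands in for the TF resource handle that makes the real layer unpicklable. It is not how Keras implements `TextVectorization`.

```python
import pickle
import threading


class ToyVectorizer:
    """Hypothetical stand-in for a layer that owns an unpicklable resource."""

    def __init__(self, max_tokens=10):
        self.max_tokens = max_tokens
        self.vocab = {}
        # A lock cannot be pickled; it plays the role of a TF resource here.
        self._lock = threading.Lock()

    def adapt(self, texts):
        # Assign an index to each new token until max_tokens is reached.
        for text in texts:
            for token in text.split():
                if token not in self.vocab and len(self.vocab) < self.max_tokens:
                    self.vocab[token] = len(self.vocab)

    def get_config(self):
        return {"max_tokens": self.max_tokens}

    def get_weights(self):
        return dict(self.vocab)

    def set_weights(self, weights):
        self.vocab = dict(weights)

    @classmethod
    def from_config(cls, config):
        return cls(**config)


original = ToyVectorizer(max_tokens=10)
original.adapt(["this is some clean text", "some more text"])

# Pickling the live object fails because of the lock it holds.
try:
    pickle.dumps(original)
    pickled_directly = True
except TypeError:
    pickled_directly = False

# Pickling the plain config and weights works.
blob = pickle.dumps({"config": original.get_config(),
                     "weights": original.get_weights()})

# Re-create the object from config, then restore its learned state.
state = pickle.loads(blob)
restored = ToyVectorizer.from_config(state["config"])
restored.set_weights(state["weights"])
```

The design choice is the same as in the answer: the constructor arguments and the learned state are ordinary Python data, so they serialize cleanly, while the runtime resource is rebuilt when the object is constructed again.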

Source: a similar question on Stack Overflow, "tensorflow - How do I save a TextVectorization layer to disk in TensorFlow?": https://stackoverflow.com/questions/65103526/
