
tensorflow 2.0 keras saving a model to hdfs: Can't decrement id ref count


I have mounted an HDFS drive via hdfs fuse, so I can access hdfs through paths like /hdfs/xxx.

After training a model with Keras, I want to save it to /hdfs/model.h5 via model.save("/hdfs/model.h5").
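For context, here is a minimal sketch of the failing call. The question does not show the actual model or training code, so the toy model below is just a stand-in:

import tensorflow as tf

# Stand-in model; the question's real architecture is not shown.
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
model.compile(optimizer="adam", loss="mse")

# A .h5 suffix selects the HDF5 format; h5py writes it with
# seek-based (random-access) I/O, which the fuse mount rejects.
model.save("/hdfs/model.h5")  # fails: Can't decrement id ref count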

I get the following error:

2020-02-26T10:06:51.83869705Z   File "h5py/_objects.pyx", line 193, in h5py._objects.ObjectID.__dealloc__
2020-02-26T10:06:51.838791107Z RuntimeError: Can't decrement id ref count (file write failed: time = Wed Feb 26 10:06:51 2020
2020-02-26T10:06:51.838796288Z , filename = '/hdfs/model.h5', file descriptor = 3, errno = 95, error message = 'Operation not supported', buf = 0x7f20d000ddc8, total write size = 512, bytes this sub-write = 512, bytes actually written = 18446744073709551615, offset = 298264)
2020-02-26T10:06:51.838802442Z Exception ignored in: 'h5py._objects.ObjectID.__dealloc__'
2020-02-26T10:06:51.838807122Z Traceback (most recent call last):
2020-02-26T10:06:51.838811833Z File "h5py/_objects.pyx", line 193, in h5py._objects.ObjectID.__dealloc__
2020-02-26T10:06:51.838816793Z RuntimeError: Can't decrement id ref count (file write failed: time = Wed Feb 26 10:06:51 2020
2020-02-26T10:06:51.838821942Z , filename = '/hdfs/model.h5', file descriptor = 3, errno = 95, error message = 'Operation not supported', buf = 0x7f20d000ddc8, total write size = 512, bytes this sub-write = 512, bytes actually written = 18446744073709551615, offset = 298264)
2020-02-26T10:06:51.838827917Z Traceback (most recent call last):
2020-02-26T10:06:51.838832755Z File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/saving/hdf5_format.py", line 117, in save_model_to_hdf5
2020-02-26T10:06:51.838838098Z f.flush()
2020-02-26T10:06:51.83885453Z File "/usr/local/lib/python3.6/dist-packages/h5py/_hl/files.py", line 452, in flush
2020-02-26T10:06:51.838859816Z h5f.flush(self.id)
2020-02-26T10:06:51.838864401Z File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
2020-02-26T10:06:51.838869302Z File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
2020-02-26T10:06:51.838874126Z File "h5py/h5f.pyx", line 146, in h5py.h5f.flush
2020-02-26T10:06:51.838879016Z RuntimeError: Can't flush cache (file write failed: time = Wed Feb 26 10:06:51 2020
2020-02-26T10:06:51.838885827Z , filename = '/hdfs/model.h5', file descriptor = 3, errno = 95, error message = 'Operation not supported', buf = 0x4e5b018, total write size = 4, bytes this sub-write = 4, bytes actually written = 18446744073709551615, offset = 34552)

However, I can write a file directly to the same path:

with open("/hdfs/a.txt", "w") as f:
    f.write("1")

I also came up with a hacky workaround, and it works...

model.save("temp.h5")
move("temp.h5", "/hdfs/model.h5")

So maybe the problem is in the Keras API? It can only save a model to a local path, not to an hdfs path.

Any idea how to fix this?

Best Answer

I don't think tensorflow makes any promises about being able to save to hdfs-fuse. Your (final) error is not "Can't flush cache" but "Can't decrement id ref count", which basically means "can't save directly to hdfs-fuse". But honestly, this seems settled to me: your workaround is just fine.
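The errno 95 ("Operation not supported") in the log is consistent with HDFS files being append-only: sequential writes through the fuse mount succeed (as your a.txt test shows), but the seek-then-overwrite pattern h5py uses does not. A small probe to confirm this on your mount (/hdfs/probe.bin is an arbitrary test path; the error may surface at the write or only at flush/close):

with open("/hdfs/probe.bin", "wb") as f:
    f.write(b"0" * 1024)  # sequential write: succeeds, like a.txt above
    f.seek(0)
    f.write(b"1")         # rewrite at offset 0: expect OSError, errno 95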

Regarding tensorflow 2.0 keras saving a model to hdfs: Can't decrement id ref count, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/60428751/
