python - Change python file in place

Reposted. Author: 太空狗. Updated: 2023-10-29 20:57:22

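I have a very large XML file (40 GB) that I need to split into smaller chunks. My working space is limited, so is there a way to delete parts of the original file as I write them out to new files?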

Thanks!

Best answer

Say you want to split the file into N pieces. Then simply start reading from the back of the file (more or less) and repeatedly call truncate:

Truncate the file's size. If the optional size argument is present, the file is truncated to (at most) that size. The size defaults to the current position. The current file position is not changed. ...
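To make the quoted behaviour concrete, here is a minimal sketch that writes ten bytes to a temporary file, truncates it to four, and checks that the file position did not move. The helper name `truncate_demo` is made up for this illustration and is not part of the original answer.

```python
import os
import tempfile

def truncate_demo():
    """Write 10 bytes, truncate to 4, and return (size, position, content)."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "r+b") as f:
            f.write(b"0123456789")
            f.seek(2)
            f.truncate(4)   # explicit size: keep only the first 4 bytes
            pos = f.tell()  # truncate does not move the file position
            f.seek(0)
            data = f.read()
        return os.path.getsize(path), pos, data
    finally:
        os.remove(path)
```

Calling `f.truncate()` without an argument would instead cut the file at the current position, which is why the splitting loop has to remember where each chunk started.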

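import os

BUF_SIZE = 4096
N = 10  # placeholder: number of pieces, choose it to suit your free disk space
size = os.stat("large_file").st_size
chunk_size = max(1, size // N)
# or simply set a fixed chunk size based on your free disk space
c = 0

# binary mode: Python 3 text files cannot seek relative to the end of the file
with open("large_file", "r+b") as in_:
    while size > 0:
        in_.seek(-min(size, chunk_size), 2)
        # now you have to find a safe place to split the file at somehow
        # just read forward until you found one
        ...
        old_pos = in_.tell()
        # copy everything from the split point to the current end of the file
        with open("small_chunk%02d" % (c, ), "wb") as out:
            b = in_.read(BUF_SIZE)
            while len(b) > 0:
                out.write(b)
                b = in_.read(BUF_SIZE)
        # drop the part just copied; the next pass works on the shorter file
        in_.truncate(old_pos)
        size = old_pos
        c += 1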
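The step the answer leaves open is finding a safe place to split. As one possible way to fill it in, the sketch below assumes that a newline is a safe split point. That is an assumption made for illustration only: real XML would need to be cut at an element boundary. The names `find_split` and `split_from_back` are hypothetical and do not come from the original answer.

```python
import os

BUF_SIZE = 4096

def find_split(src, size, chunk_size):
    """Return an offset roughly chunk_size before `size` that follows a newline."""
    window = chunk_size
    while True:
        start = size - window
        if start <= 0:
            return 0            # reached the start of the file
        src.seek(start)
        src.readline()          # skip the rest of the line we landed in
        pos = src.tell()
        if pos < size:
            return pos
        window *= 2             # no newline in this window, look further back

def split_from_back(path, chunk_size, prefix="small_chunk"):
    """Move `path` into chunk files, truncating it after each chunk is written.

    Returns the chunk file names in original file order.
    """
    names = []
    size = os.path.getsize(path)
    c = 0
    with open(path, "r+b") as src:
        while size > 0:
            pos = find_split(src, size, chunk_size)
            src.seek(pos)
            name = "%s%02d" % (prefix, c)
            with open(name, "wb") as out:
                b = src.read(BUF_SIZE)
                while b:
                    out.write(b)
                    b = src.read(BUF_SIZE)
            src.truncate(pos)   # release the part that was just copied
            size = pos
            names.append(name)
            c += 1
    return names[::-1]          # chunk 00 holds the END of the original file
```

Because the file is consumed from the back, the chunk numbered 00 contains the end of the original, so the returned list is reversed to restore file order. Note that `readline` loads a whole line into memory, so this sketch is unsuitable for input that has very long lines or no newlines at all.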

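Be careful, as I haven't tested any of this. It may be necessary to call flush after the truncate call, and I don't know how quickly the file system actually frees the space.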
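One way to act on that caveat is sketched below: flush Python's buffer and call `os.fsync` right after truncating, then look at the free space reported by `shutil.disk_usage`. The helper name `truncate_and_sync` is hypothetical. Whether and when the freed blocks show up as available space depends on the file system, so treat the returned number as an observation, not a guarantee.

```python
import os
import shutil

def truncate_and_sync(f, pos):
    """Truncate an open binary file to `pos` bytes and push the change to disk.

    Returns the free space (in bytes) reported for the file's directory.
    """
    f.truncate(pos)
    f.flush()             # empty Python's own buffer
    os.fsync(f.fileno())  # ask the OS to write the file's state to disk
    directory = os.path.dirname(os.path.abspath(f.name))
    return shutil.disk_usage(directory).free
```

In the splitting loop this would stand in for the plain `truncate` call, at the cost of one sync per chunk.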

Regarding "python - Change python file in place", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/1145286/
