
python - Memory leak when creating a buffer with Pandas?

Reposted. Author: 太空狗. Updated: 2023-10-30 01:34:21

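I'm using pandas to build a ring buffer, but memory usage keeps growing. What am I doing wrong?

Here is the code (edited slightly from the first version of the question):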

import pandas as pd
import numpy as np
import resource


# Start with a 10000-row buffer of zeros.
tempdata = np.zeros((10000, 3))
tdf = pd.DataFrame(data=tempdata, columns=['a', 'b', 'c'])

i = 0
while True:
    i += 1
    littledf = pd.DataFrame(np.random.rand(1000, 3), columns=['a', 'b', 'c'])
    # Drop the oldest 1000 rows and append the newest 1000.
    tdf = pd.concat([tdf[1000:], littledf], ignore_index=True)
    del littledf
    # ru_maxrss is the peak resident set size (bytes on macOS, kilobytes on Linux).
    currentmemory = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    if i % 1000 == 0:
        print('total memory:%d kb' % (int(currentmemory) // 1000))

This is what I get:

total memory:37945 kb
total memory:38137 kb
total memory:38137 kb
total memory:38768 kb
total memory:38768 kb
total memory:38776 kb
total memory:38834 kb
total memory:38838 kb
total memory:38838 kb
total memory:38850 kb
total memory:38854 kb
total memory:38871 kb
total memory:38871 kb
total memory:38973 kb
total memory:38977 kb
total memory:38989 kb
total memory:38989 kb
total memory:38989 kb
total memory:39399 kb
total memory:39497 kb
total memory:39587 kb
total memory:39587 kb
total memory:39591 kb
total memory:39604 kb
total memory:39604 kb
total memory:39608 kb
total memory:39608 kb
total memory:39608 kb
total memory:39608 kb
total memory:39608 kb
total memory:39608 kb
total memory:39612 kb

I'm not sure whether it is related to this:

https://github.com/pydata/pandas/issues/2659

Tested on a MacBook Air with Anaconda Python.

Best answer

Instead of using concat, why not update the DataFrame in place? i % 10 determines which 1000-row slot each update is written to.

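# Reuses tdf, np and resource from the question's code.
i = 0
while True:
    i += 1
    # Overwrite one 1000-row slot in place; no new DataFrame is created.
    tdf.iloc[1000*(i % 10):1000+1000*(i % 10)] = np.random.rand(1000, 3)
    currentmemory = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    if i % 1000 == 0:
        print('total memory:%d kb' % (int(currentmemory) // 1000))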
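
To see which slot each update lands in, here is a small bounded sketch of the same in-place idea. It is only an illustration: the names SLOT_ROWS, N_SLOTS, write_slot and oldest_first are hypothetical and not part of the answer, and each block is filled with the value of i rather than random numbers so the slots are easy to check. One thing to keep in mind is that, unlike the concat version, the rows are no longer stored oldest-first, so the sketch rotates the buffer when a chronological view is needed.

```python
import numpy as np
import pandas as pd

SLOT_ROWS = 1000  # rows written per update (hypothetical name)
N_SLOTS = 10      # 10 slots of 1000 rows give the 10000-row buffer


def write_slot(buf, i, block):
    """Overwrite the slot selected by i % N_SLOTS in place."""
    start = SLOT_ROWS * (i % N_SLOTS)
    buf.iloc[start:start + SLOT_ROWS] = block
    return start


def oldest_first(buf, i):
    """Return the buffer as an array rotated so the oldest slot comes first.

    After update i, the next slot to be overwritten, (i + 1) % N_SLOTS,
    holds the oldest data.
    """
    start = SLOT_ROWS * ((i + 1) % N_SLOTS)
    return np.roll(buf.to_numpy(), -start, axis=0)


buf = pd.DataFrame(np.zeros((SLOT_ROWS * N_SLOTS, 3)), columns=['a', 'b', 'c'])

# Run a fixed number of updates instead of an endless loop.
last_i = 25
for i in range(1, last_i + 1):
    # Fill each block with the update number so slots are identifiable.
    write_slot(buf, i, np.full((SLOT_ROWS, 3), float(i)))

ordered = oldest_first(buf, last_i)
print(buf.shape)                      # the buffer never changes size
print(ordered[0, 0], ordered[-1, 0])  # oldest and newest update numbers
```

The buffer is allocated once and only its contents change, which is the point of the in-place approach.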

Regarding "python - Memory leak when creating a buffer with Pandas?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/20726661/
