
python - Django Postgres memory leak

Reposted. Author: 行者123. Updated: 2023-12-02 02:08:39

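I have a custom Django (v2.0.0) management command that starts a multithreaded background job executor, and it appears to be leaking memory.

The command can be started like this: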

./manage.py start_job_executer --thread=1

Each thread runs a while True loop that fetches jobs from a PostgreSQL table.

To pick up a job and change its status atomically, I use a transaction:

# atomic transaction to temporarily lock the selected row and
# get the oldest job from the db with column status = pending
with transaction.atomic():
    job = Job.objects.select_for_update() \
        .filter(status=Job.STATUS['pending']) \
        .order_by('created_at').first()
    if job:
        job.status = Job.STATUS['executing']
        job.save()

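The memory allocated by this custom Django command keeps growing.

Using tracemalloc, I tried to find the cause of the leak by creating a background thread that checks memory allocations: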

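import tracemalloc
from time import sleep

def check_memory(self):
    # tracemalloc.start() must have been called before this runs
    while True:
        s1 = tracemalloc.take_snapshot()
        sleep(10)
        s2 = tracemalloc.take_snapshot()
        # log the 10 source lines whose allocations changed the most
        for alog in s2.compare_to(s1, 'lineno')[:10]:
            log.info(alog)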
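
For reference, the snapshot-diff approach can be tried in isolation. The following is a minimal, self-contained sketch: the `leaky_log` list and the `simulate_queries` and `top_growth` helpers are hypothetical stand-ins invented for illustration, not part of Django or of the code above. Note that `tracemalloc.start()` has to be called before the first snapshot, otherwise nothing is traced.

```python
import tracemalloc

# Hypothetical stand-in for a structure that keeps references alive.
leaky_log = []


def simulate_queries(n):
    """Append n distinct strings to the module-level list (never freed)."""
    for i in range(n):
        leaky_log.append("SELECT * FROM job WHERE id = %d" % i)


def top_growth(before, after, limit=3):
    """Return the source lines whose allocated size grew between snapshots."""
    stats = after.compare_to(before, "lineno")
    return [s for s in stats if s.size_diff > 0][:limit]


tracemalloc.start()
before = tracemalloc.take_snapshot()
simulate_queries(1000)
after = tracemalloc.take_snapshot()
growth = top_growth(before, after)
tracemalloc.stop()

for stat in growth:
    print(stat)
```

Each printed line has the same shape as the log entries in this question (file and line number, size, size difference, count), so the line that appends to `leaky_log` should show up among the top entries.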

This produced the following logs:

01.04.20 13:50:06 operations.py:222: size=23.7 KiB (+23.7 KiB), count=66 (+66), average=367 B
01.04.20 13:50:36 operations.py:222: size=127 KiB (+43.7 KiB), count=353 (+122), average=367 B
01.04.20 13:51:04 operations.py:222: size=251 KiB (+66.7 KiB), count=699 (+186), average=367 B
01.04.20 13:51:31 operations.py:222: size=379 KiB (+68.9 KiB), count=1056 (+192), average=367 B
01.04.20 13:51:57 operations.py:222: size=495 KiB (+60.3 KiB), count=1380 (+168), average=367 B

It looks like /usr/local/lib/python3.5/dist-packages/django/db/backends/postgresql/operations.py:222 is not releasing memory.

With 1 thread the leak is slow, but with 8 threads it is much worse:

01.04.20 13:07:51 operations.py:222: size=68.3 KiB (+68.3 KiB), count=191 (+191), average=366 B
01.04.20 13:08:56 operations.py:222: size=770 KiB (+140 KiB), count=2151 (+390), average=367 B
01.04.20 13:10:07 operations.py:222: size=1476 KiB (+138 KiB), count=4122 (+386), average=367 B

01.04.20 13:36:22 operations.py:222: size=17.3 MiB (+138 KiB), count=49506 (+385), average=367 B

01.04.20 13:48:16 operations.py:222: size=24.5 MiB (+136 KiB), count=69993 (+379), average=367 B

This is the code around line 222 of /usr/local/lib/python3.5/dist-packages/django/db/backends/postgresql/operations.py:

def last_executed_query(self, cursor, sql, params):
    # http://initd.org/psycopg/docs/cursor.html#cursor.query
    # The query attribute is a Psycopg extension to the DB API 2.0.
    if cursor.query is not None:
        return cursor.query.decode()  # this is line 222!
    return None

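I have no idea how to solve this. Any ideas?

Also posted here: https://code.djangoproject.com/ticket/31419#ticket

One idea is to spawn a new process for each job that needs to be executed; once the job finishes, the process exits and its memory is released. That would probably work, but it seems like overkill.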
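
The process-per-job idea could look roughly like the sketch below. It uses only the standard library; `CHILD_CODE` and `run_job_in_fresh_process` are hypothetical names, and the child here only echoes the job id and its own PID, where a real worker would set up Django and execute the job.

```python
import os
import subprocess
import sys

# Hypothetical job body: a real worker would set up Django and run the job here.
CHILD_CODE = "import os, sys; print(sys.argv[1], os.getpid())"


def run_job_in_fresh_process(job_id):
    """Run one job in a new interpreter; its memory is released when it exits."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE, str(job_id)],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the child fails
    )
    reported_id, child_pid = result.stdout.split()
    return int(reported_id), int(child_pid)


job_id, child_pid = run_job_in_fresh_process(42)
print(job_id, child_pid != os.getpid())
```

The trade-off is the interpreter and framework start-up cost paid for every job, which is why this tends to be a last resort when the real cause of the growth cannot be removed.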

Thanks in advance

Update

I was using Django 2.0, so I tried updating to Django 3.0.5 (the latest stable release), but unfortunately the problem persists.

The new logs are below:

01.04.20 20:15:06 operations.py:235: size=977 KiB (+53.9 KiB), count=2750 (+152), average=364 B
01.04.20 20:15:28 operations.py:235: size=1070 KiB (+50.1 KiB), count=3012 (+141), average=364 B
01.04.20 20:15:53 operations.py:235: size=1156 KiB (+43.7 KiB), count=3255 (+123), average=364 B
01.04.20 20:16:19 operations.py:235: size=1245 KiB (+44.7 KiB), count=3507 (+126), average=364 B

01.04.20 20:20:23 operations.py:235: size=2154 KiB (+44.3 KiB), count=6065 (+125), average=364 B

Best answer

When settings.DEBUG = True, Django keeps references to all executed queries in a ring buffer.

From the DEBUG documentation:

It is also important to remember that when running with DEBUG turned on, Django will remember every SQL query it executes. This is useful when you’re debugging, but it’ll rapidly consume memory on a production server.

Setting DEBUG = False should solve your problem.
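
To illustrate the mechanism, here is a simplified simulation using only the standard library. It is a sketch under stated assumptions, not Django's actual code: `FakeConnection` and `QUERIES_LIMIT` are hypothetical names, and the cap of 9000 entries is an assumed value chosen for the example. The point is that a bounded buffer keeps every recent query string alive while debugging is on, and holds nothing when it is off.

```python
from collections import deque

# Assumed cap for this sketch only; not taken from the question or answer.
QUERIES_LIMIT = 9000


class FakeConnection:
    """Hypothetical stand-in for a per-thread database connection."""

    def __init__(self, debug):
        self.debug = debug
        # A deque with maxlen behaves as a ring buffer: old entries fall off.
        self.queries_log = deque(maxlen=QUERIES_LIMIT)

    def execute(self, sql):
        # Queries are only remembered when debugging is enabled.
        if self.debug:
            self.queries_log.append({"sql": sql, "time": "0.001"})

    def reset_queries(self):
        self.queries_log.clear()


debug_conn = FakeConnection(debug=True)
prod_conn = FakeConnection(debug=False)
for i in range(20000):
    sql = "SELECT ... WHERE id = %d" % i
    debug_conn.execute(sql)
    prod_conn.execute(sql)

retained_debug = len(debug_conn.queries_log)
retained_prod = len(prod_conn.queries_log)
debug_conn.reset_queries()
retained_after_reset = len(debug_conn.queries_log)

print(retained_debug, retained_prod, retained_after_reset)  # 9000 0 0
```

In this model, memory grows steadily until the buffer is full and then levels off, and each additional connection (one per thread) carries its own buffer, which matches the faster growth seen with more threads.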

To clear the ring buffer in development, where this can be a problem:

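from django.conf import settings
from django.db import reset_queries

# drop the stored query log, e.g. once per loop iteration in each worker
if settings.DEBUG:
    reset_queries()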

Regarding python - Django Postgres memory leak, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/60972577/
