
mysql - Django ORM performs poorly compared with raw SQL

Repost. Author: 可可西里. Updated: 2023-11-01 06:37:18

I'm using the Django ORM to query data, and this table has nearly 2 million rows. I tried

app_count = App.objects.count()

from django.db import connection
cursor = connection.cursor()
cursor.execute('''SELECT count(*) FROM app''')

The MySQL slow query log gave me

Time: 2017-04-27T09:18:38.809498Z

User@Host: www[www] @ [172.19.0.3] Id: 5

Query_time: 4.107433 Lock_time: 0.004405 Rows_sent: 1 Rows_examined: 0

use app_platform; SET timestamp=1493284718; SELECT count(*) FROM app;

This query takes more than 4 seconds on average, but when I run the same query from the mysql client (mysql shell)

mysql> select count(*) from app;

+----------+
| count(*) |
+----------+
| 1870019 |
+----------+

1 row in set (0.41 sec)

it takes only 0.4 seconds, a 10x difference. Why, and how can I improve it?

Edit

Here is my model

class AppMain(models.Model):
    """
    """
    store = models.ForeignKey("AppStore", related_name="main_store")
    name = models.CharField(max_length=256)
    version = models.CharField(max_length=256, blank=True)
    developer = models.CharField(db_index=True, max_length=256, blank=True)
    md5 = models.CharField(max_length=256, blank=True)
    type = models.CharField(max_length=256, blank=True)
    size = models.IntegerField(blank=True)
    download = models.CharField(max_length=1024, blank=True)
    download_md5 = models.CharField(max_length=256, blank=True)
    download_times = models.BigIntegerField(blank=True)
    snapshot = models.CharField(max_length=2048, blank=True)
    description = models.CharField(max_length=5000, blank=True)
    app_update_time = models.DateTimeField(blank=True)
    create_time = models.DateTimeField(db_index=True, auto_now_add=True)
    update_time = models.DateTimeField(auto_now=True)

    class Meta:
        unique_together = ("store", "name", "version")

Edit 2

I'm using Docker and docker-compose for my project

version: '2'
services:
  mysqldb:
    restart: always
    image: mysql:latest
    ports:
      - "3306:3306"
    environment:
      MYSQL_ROOT_PASSWORD: just_for_test
      MYSQL_USER: www
      MYSQL_PASSWORD: www
      MYSQL_DATABASE: app_platform
    volumes:
      - mysqldata:/var/lib/mysql
      - ./config/:/etc/mysql/conf.d
      - ./log/mysql/:/var/log/mysql/
  web:
    restart: always
    build: ./app_platform/app_platform
    env_file: .env
    environment:
      PYTHONPATH: '/usr/src/app/app_platform'
    command: bash -c "gunicorn --chdir /usr/src/app/app_platform app_platform.wsgi:application -k gevent -w 6 -b :8000 --timeout 8000 --reload"
    volumes:
      - ./app_platform:/usr/src/app
      - ./sqldata:/usr/src/sqldata
      - /usr/src/app/static
    ports:
      - "8000"
    dns:
      - 114.114.114.114
      - 8.8.8.8
    links:
      - mysqldb
  nginx:
    restart: always
    build: ./nginx/
    ports:
      - "80:80"
    volumes:
      - ./app_platform:/usr/src/app
      - ./nginx/sites-enabled/:/etc/nginx/sites-enabled
    links:
      - web:web
volumes:
  mysqldata:

My Django settings look like this:

import os
from django.utils.translation import ugettext_lazy as _

LANGUAGES = (
    ('en', _('English')),
    ('zh-CN', _('Chinese')),
)


LANGUAGE_CODE = 'zh-CN'

BASE_DIR = os.path.dirname(
    os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

LOCALE_PATHS = (
    os.path.join(BASE_DIR, "locale"),
)

# SECURITY WARNING: keep the secret key used in production secret!
SECRET_KEY = 'just_for_test'

INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'rest_framework',
    'app_scrapy',
    'app_user',
    'app_api',
    'app_check',
    'common',
    'debug_toolbar',
]


MIDDLEWARE_CLASSES = [
    'django.middleware.security.SecurityMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'debug_toolbar.middleware.DebugToolbarMiddleware',
    'django.middleware.locale.LocaleMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.middleware.csrf.CsrfViewMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.contrib.auth.middleware.SessionAuthenticationMiddleware',
    'django.contrib.messages.middleware.MessageMiddleware',
    'django.middleware.clickjacking.XFrameOptionsMiddleware',
]

AUTH_USER_MODEL = 'app_user.MyUser'

AUTHENTICATION_BACKENDS = (
    'app_user.models.CustomAuth', 'django.contrib.auth.backends.ModelBackend')


ROOT_URLCONF = 'app_platform.urls'


TEMPLATES = [
    {
        'BACKEND': 'django.template.backends.django.DjangoTemplates',
        'DIRS': ["/usr/src/app/app_platform/templates"],
        'APP_DIRS': True,
        'OPTIONS': {
            'context_processors': [
                'django.template.context_processors.debug',
                'django.template.context_processors.request',
                'django.template.context_processors.i18n',
                'django.contrib.auth.context_processors.auth',
                'django.contrib.messages.context_processors.messages',
            ],
        },
    },
]

WSGI_APPLICATION = 'app_platform.wsgi.application'

LOGIN_REDIRECT_URL = '/'
LOGIN_URL = '/login/'
# Database
# https://docs.djangoproject.com/en/1.9/ref/settings/#databases
# Password validation
# https://docs.djangoproject.com/en/1.9/ref/settings/#auth-password-validators

AUTH_PASSWORD_VALIDATORS = [
    {
        'NAME': 'django.contrib.auth.password_validation.UserAttributeSimilarityValidator',
    },
    {
        'NAME': 'django.contrib.auth.password_validation.MinimumLengthValidator',
    },
    {
        'NAME': 'django.contrib.auth.password_validation.CommonPasswordValidator',
    },
    {
        'NAME': 'django.contrib.auth.password_validation.NumericPasswordValidator',
    },
]

STATICFILES_FINDERS = (
    'django.contrib.staticfiles.finders.FileSystemFinder',
    'django.contrib.staticfiles.finders.AppDirectoriesFinder'
)

# Internationalization
# https://docs.djangoproject.com/en/1.9/topics/i18n/

TIME_ZONE = 'Asia/Shanghai'

USE_I18N = True

USE_L10N = True

USE_TZ = True


# Static files (CSS, JavaScript, Images)
# https://docs.djangoproject.com/en/1.9/howto/static-files/

STATIC_ROOT = "/static/"

STATIC_URL = '/static/'

STATICFILES_DIRS = (
    'public/static/',
)


DEBUG = True

ALLOWED_HOSTS = []

REST_FRAMEWORK = {
    'DEFAULT_AUTHENTICATION_CLASSES': (
        'rest_framework.authentication.BasicAuthentication',
        'rest_framework.authentication.SessionAuthentication',
    ),
    'DEFAULT_PERMISSION_CLASSES': (
        'rest_framework.permissions.AllowAny',
    ),
    'DEFAULT_PAGINATION_CLASS':
        'rest_framework.pagination.LimitOffsetPagination',
    'PAGE_SIZE': 5,
}

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'app_platform',
        'USER': 'www',
        'PASSWORD': 'www',
        'HOST': 'mysqldb',  # Or an IP Address that your DB is hosted on
        'PORT': '3306',
    }
}

DEBUG_TOOLBAR_CONFIG = {
    "SHOW_TOOLBAR_CALLBACK": lambda request: True,
}

My app table definition

CREATE TABLE `app` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(256) NOT NULL,
  `version` varchar(256) NOT NULL,
  `developer` varchar(256) NOT NULL,
  `md5` varchar(256) NOT NULL,
  `type` varchar(256) NOT NULL,
  `size` int(11) NOT NULL,
  `download` varchar(1024) NOT NULL,
  `download_md5` varchar(256) NOT NULL,
  `download_times` bigint(20) NOT NULL,
  `snapshot` varchar(2048) NOT NULL,
  `description` varchar(5000) NOT NULL,
  `app_update_time` datetime(6) NOT NULL,
  `create_time` datetime(6) NOT NULL,
  `update_time` datetime(6) NOT NULL,
  `store_id` int(11) NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `app_store_id_6822fab1_uniq` (`store_id`,`name`,`version`),
  KEY `app_7473547c` (`store_id`),
  KEY `app_developer_b74bcd8e_uniq` (`developer`),
  KEY `app_create_time_a071d977_uniq` (`create_time`),
  CONSTRAINT `app_store_id_aef091c6_fk_app_scrapy_appstore_id` FOREIGN KEY (`store_id`) REFERENCES `app_scrapy_appstore` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1870020 DEFAULT CHARSET=utf8;

Edit 3

Here is EXPLAIN SELECT COUNT(*) FROM app;

mysql> EXPLAIN SELECT COUNT(*) FROM `app`;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------------------+
| 1 | SIMPLE | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | Select tables optimized away |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------------------+
1 row in set, 1 warning (0.00 sec)

Edit 4

Here is my mysql.cnf

innodb_read_io_threads=12
innodb_write_io_threads=12
innodb_io_capacity=300
innodb_read_io_threads=12
innodb_write_io_threads=12 #To stress the double write buffer
innodb_buffer_pool_size=3G
innodb_log_file_size = 32M #Small log files, more page flush
innodb_log_buffer_size=8M
innodb_flush_method=O_DIRECT

My Docker setup has 2 CPUs and 4 GB of memory

Edit 5

When I run the ORM query in the Django shell, it takes only 0.5-1 seconds. So is the problem in the Docker setup? Or in the gunicorn setup?

Best answer

10X. I like it. That exactly matches my Rule of Thumb: "If the data is not cached, the query will take 10 times as long as if it is cached." (Rick's RoTs)
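One way to see whether caching explains a gap like this is to run the same query several times and compare the first run with the later ones. The following is only a minimal sketch: the helper names time_runs and summarize are hypothetical, and the timed callable here is a placeholder computation rather than a real database query.

```python
import statistics
import time


def time_runs(fn, runs=5):
    """Call fn() `runs` times and return the elapsed seconds of each call."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Separate the first (possibly cold-cache) run from the later runs."""
    first, rest = timings[0], timings[1:]
    warm = statistics.median(rest) if rest else first
    return {"first": first, "warm_median": warm}


# Placeholder workload (an assumption); it stands in for the real query.
timings = time_runs(lambda: sum(range(100_000)), runs=5)
report = summarize(timings)
print(report)
```

In a Django shell, the placeholder could be swapped for something like `lambda: App.objects.count()` from the question. A first run that is much slower than the later ones would point toward caching rather than the ORM.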

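But let's move on to the real question: "4.1s is too slow, what can I do about it?"

  • Change your app so that you don't need the row count. Have you noticed that search engines no longer say "out of 12345678 hits"?

  • Keep an estimate, rather than recomputing.

  • Let's see EXPLAIN SELECT COUNT(*) FROM app; it may give more clues. (In one place you say app, in another you say app_scrapy_appmain; are they the same?)

  • As long as you never DELETE any rows, this will give you the same answer: SELECT MAX(id) FROM app, and it runs "instantly". (Once a DELETE, ROLLBACK, etc. happens, id(s) will be lost, so COUNT will be less than MAX.)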
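The MAX(id) shortcut from the last bullet, and the way it drifts after a DELETE, can be illustrated with a small sketch. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL (an assumption, so that the example runs anywhere). The two-column table and the helper name count_and_max are hypothetical.

```python
import sqlite3


def count_and_max(conn):
    """Return (COUNT(*), MAX(id)) for the stand-in app table."""
    count = conn.execute("SELECT COUNT(*) FROM app").fetchone()[0]
    max_id = conn.execute("SELECT MAX(id) FROM app").fetchone()[0]
    return count, max_id


# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE app (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO app (name) VALUES (?)", [(f"app-{i}",) for i in range(1000)]
)
conn.commit()

before = count_and_max(conn)  # no deletes yet, so the two numbers agree

conn.execute("DELETE FROM app WHERE id = 500")
conn.commit()

after = count_and_max(conn)  # one id is gone, so COUNT is now below MAX
print(before, after)
```

MAX(id) is therefore only an upper bound once rows can disappear, which is fine for an estimate but not for an exact count.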

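More

innodb_buffer_pool_size=3G is probably too much on only 4GB of RAM. If MySQL swaps, performance will become really bad. I suggest only 2G, unless you can see that it is not swapping.

Note: Scanning 1.8M rows is destined to take at least 0.4s on that hardware, or any hardware. It takes time to do the task. Also, running the "long" query interferes with other tasks in two ways: it consumes CPU and/or I/O while it executes, and it may bump other blocks out of the cache, causing other queries to slow down. So I really think the "right" thing to do is to heed my tips on avoiding COUNT(*). Here's another:

  • Build and maintain a "summary table" of daily subtotals for this (and other) tables. Include in it the daily COUNT(*) and whatever else you might want. By using SUM(subtotal) from this table, this will even shrink the 0.4s timing. More on Summary Tables.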
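The summary-table idea can be sketched as follows. This is one possible way to maintain the subtotals (incrementing on every insert), not necessarily what the linked article recommends. It again uses the built-in sqlite3 module as a stand-in for MySQL, and the table name app_daily_summary, the column names, the helper names, and the sample dates are all hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE app (id INTEGER PRIMARY KEY, name TEXT NOT NULL, create_date TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE app_daily_summary (day TEXT PRIMARY KEY, subtotal INTEGER NOT NULL)"
)


def insert_app(conn, name, create_date):
    """Insert one row and bump that day's subtotal in the same transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT INTO app (name, create_date) VALUES (?, ?)", (name, create_date)
        )
        updated = conn.execute(
            "UPDATE app_daily_summary SET subtotal = subtotal + 1 WHERE day = ?",
            (create_date,),
        ).rowcount
        if updated == 0:  # first row for this day: create the subtotal
            conn.execute(
                "INSERT INTO app_daily_summary (day, subtotal) VALUES (?, 1)",
                (create_date,),
            )


def total_from_summary(conn):
    """Row count derived from the small summary table instead of a full scan."""
    row = conn.execute(
        "SELECT COALESCE(SUM(subtotal), 0) FROM app_daily_summary"
    ).fetchone()
    return row[0]


# Illustrative data: 5, 3 and 2 rows on three made-up days.
for day, n in [("2017-04-25", 5), ("2017-04-26", 3), ("2017-04-27", 2)]:
    for i in range(n):
        insert_app(conn, f"app-{day}-{i}", day)

print(total_from_summary(conn))
```

The total is read from one summary row per day rather than from every row in app, so the cost grows with the number of days, not the number of apps. A periodic batch job that fills the summary with a GROUP BY would be an alternative to incrementing on each insert.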

Regarding "mysql - Django ORM performs poorly compared with raw SQL", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/43670770/
