
mysql - How can I optimize this query? It takes 3 minutes to run

Reposted. Author: 太空宇宙. Updated: 2023-11-03 10:37:35

I have a database with 3 tables.

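  1. A calendar table, with one row for every date between 2000-01-01 and 2040-01-01, 14610 rows in total
  2. A locations table, where each location has an id and a name, 12 rows in total
  3. A receipts table, which has an id and a datetime plus several other unrelated fields, about 250,000 rows in total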

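I am trying to get the receipt count for every day in a date range, grouped by location, with a count of zero for days that have no receipts.

I have a query that works, but it takes about 3 minutes to run: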

SELECT
`locations`.`name` AS `location`,
`calendar`.`date` AS `date`,
COUNT(`receipts`.`id`) AS `count`
FROM `locations`
CROSS JOIN `calendar`
LEFT JOIN `receipts` ON `calendar`.`date` = DATE(`receipts`.`datetime`)
AND `locations`.`id` = UPPER(LEFT(`receipts`.`id`, 1)) # there is no `location_id` FK. First char of receipts id is same as location id
WHERE `calendar`.`date` >= '2017-04-01' AND `calendar`.`date` <= '2017-04-07'
GROUP BY `locations`.`id`, `calendar`.`id`
ORDER BY `locations`.`name` ASC, `calendar`.`date` ASC;

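I believe it has something to do with the WHERE clause.

I changed the WHERE to the following and it runs instantly, but it no longer gives me the zero counts for days without receipts: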

SELECT
`locations`.`name` AS `location`,
`calendar`.`date` AS `date`,
COUNT(`receipts`.`id`) AS `count`
FROM `locations`
CROSS JOIN `calendar`
LEFT JOIN `receipts` ON `calendar`.`date` = DATE(`receipts`.`datetime`)
AND `locations`.`id` = UPPER(LEFT(`receipts`.`id`, 1)) # there is no `location_id` FK. First char of receipts id is same as location id
WHERE DATE(`receipts`.`datetime`) >= '2017-04-01' AND DATE(`receipts`.`datetime`) <= '2017-04-07'
GROUP BY `locations`.`id`, `calendar`.`id`
ORDER BY `locations`.`name` ASC, `calendar`.`date` ASC;
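
One way to see why this version loses its zero counts: for a day with no receipts, the LEFT JOIN produces a row whose receipts columns are all NULL, and a WHERE test on DATE(`receipts`.`datetime`) then evaluates to NULL, so the row is discarded. The following is a minimal sketch using inline data rather than the real tables; the aliases c, r, d, dt and where_test are made up for illustration.

```sql
-- Self-contained demo with inline data (no real tables needed).
-- c holds two calendar days; r holds a single receipt on the first day.
SELECT
  c.d                        AS calendar_date,
  r.dt                       AS receipt_datetime,
  DATE(r.dt) >= '2017-04-01' AS where_test  -- 1 = row would pass WHERE, NULL = row would be dropped
FROM (SELECT DATE('2017-04-01') AS d
      UNION ALL
      SELECT DATE('2017-04-02')) AS c
LEFT JOIN (SELECT TIMESTAMP('2017-04-01 09:30:00') AS dt) AS r
  ON c.d = DATE(r.dt);
```

The 2017-04-02 row comes back with receipt_datetime and where_test both NULL. Adding that test to a WHERE clause would remove exactly the rows that should have been counted as zero, which in effect turns the LEFT JOIN into an inner join.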

Then I started experimenting with subqueries, without success:

SELECT
`locations`.`name` AS `location`,
`cal`.`date` AS `date`,
COUNT(`receipts`.`id`) AS `count`
FROM `locations`
CROSS JOIN (
SELECT `calendar`.`id`, `calendar`.`date`
FROM `calendar`
WHERE `calendar`.`date` >= '2017-04-01' AND `calendar`.`date` <= '2017-04-07'
) `cal`
LEFT JOIN `receipts` ON `cal`.`date` = DATE(`receipts`.`datetime`)
AND `locations`.`id` = UPPER(LEFT(`receipts`.`id`, 1)) # there is no `location_id` FK. First char of receipts id is same as location id
WHERE DATE(`receipts`.`datetime`) >= '2017-04-01' AND DATE(`receipts`.`datetime`) <= '2017-04-07'
GROUP BY `locations`.`id`, `cal`.`id`
ORDER BY `locations`.`name` ASC, `cal`.`date` ASC;

Is there any way to speed up the first query, since that is the one that gives me the output I want?

Best answer

Try this:

SELECT l.name location, c.date, COUNT(r.id) count
FROM calendar c
left join calendar n on n.Date = c.Date + 1 -- one day after c.date
left join (locations l join receipts r
on r.id like concat(l.Id, '%')) -- MySQL joins strings with CONCAT(), not +; the location id is the first character of r.id
on r.datetime between c.Date and n.Date
where c.Date between '2017-04-01' and '2017-04-07'
GROUP BY l.id, c.id
ORDER BY l.name, c.date;

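Your problems are due to the following:
1. You are using an unnecessary cross join. A cross join creates a Cartesian product (every row on one side is combined with every row on the other side). So cross joining the alphabet with the 10 digits would produce 260 rows: {A0, A1, A2, ... A9, B0, B1, ... B9, ...} and so on.
2. Your SQL query contains several constructs (although one would be enough) that force the query processor to read every row of the table from disk, effectively preventing it from using any index that might exist on the table. Applying a function to a column value in order to filter (WHERE clause) or sort (ORDER BY clause) does this, because the query processor cannot know what the function will return without executing it, and it has to read the row from the main table on disk to get the underlying value to run the function on. If the predicate used just the raw column value, and that column were in an index, the processor would not need to read the main data table; it could simply traverse the index, which is generally much smaller and requires far less disk I/O.

A predicate that can use an index in this way is called SARGable (Search ARGument-able). Wrapping the column in a function, as DATE(receipts.datetime) does, makes the predicate non-SARGable.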
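
To make the difference concrete, the sketch below contrasts a function-wrapped filter with an equivalent half-open range on the raw column. It assumes receipts.datetime has no index yet; the index name idx_receipts_datetime is hypothetical, and the plans EXPLAIN reports will depend on the MySQL version and the data.

```sql
-- Hypothetical index; skip this if receipts.datetime is already indexed.
CREATE INDEX idx_receipts_datetime ON `receipts` (`datetime`);

-- Non-SARGable: DATE() must be evaluated for every row, so the index
-- on `datetime` cannot be used to narrow the search.
EXPLAIN
SELECT COUNT(*)
FROM `receipts`
WHERE DATE(`datetime`) = '2017-04-01';

-- SARGable: the raw column is compared against a half-open range
-- [2017-04-01 00:00:00, 2017-04-02 00:00:00), which an index can serve.
EXPLAIN
SELECT COUNT(*)
FROM `receipts`
WHERE `datetime` >= '2017-04-01'
  AND `datetime` <  '2017-04-01' + INTERVAL 1 DAY;
```

Comparing the two EXPLAIN outputs (in particular the type, key and rows columns) is a quick way to check whether the rewritten predicate is actually using the index. The half-open range also avoids counting a receipt stamped exactly at midnight on two consecutive days, which an inclusive BETWEEN over both day boundaries would do.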

If c.Date + 1 does not work in MySQL, then try this:

SELECT l.name location, c.date, COUNT(r.id) count
FROM calendar c
left join calendar n on n.Date =
(Select min(date) from Calendar -- subquery gets the next day in calendar
Where date > c.Date)
left join (locations l join receipts r
on r.id like concat(l.Id, '%')) -- MySQL joins strings with CONCAT(), not +; the location id is the first character of r.id
on r.datetime between c.Date and n.Date
where c.Date between '2017-04-01' and '2017-04-07'
GROUP BY l.id, c.id
ORDER BY l.name, c.date;
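
For comparison, here is a hedged variant written in MySQL syntax throughout. It is a sketch, not the answer author's query: it keeps the CROSS JOIN from the question so that every location still gets a zero-count row on days without receipts, computes the next day with INTERVAL 1 DAY instead of a second join to calendar, and matches the location on the first character of receipts.id. It assumes an index on receipts.datetime and a case-insensitive collation on receipts.id (otherwise the UPPER() from the question may still be needed).

```sql
SELECT
  l.name      AS `location`,
  c.date      AS `date`,
  COUNT(r.id) AS `count`          -- counts only matched receipts, so unmatched days give 0
FROM calendar AS c
CROSS JOIN locations AS l         -- after the WHERE filter: 7 days x 12 locations
LEFT JOIN receipts AS r
       ON r.datetime >= c.date                   -- raw column, half-open day range
      AND r.datetime <  c.date + INTERVAL 1 DAY
      AND r.id LIKE CONCAT(l.id, '%')            -- first character of r.id is the location id
WHERE c.date BETWEEN '2017-04-01' AND '2017-04-07'
GROUP BY l.id, l.name, c.id, c.date
ORDER BY l.name, c.date;
```

All receipt-side conditions sit in the ON clause rather than in WHERE, so calendar and location rows without a matching receipt survive the join. Whether the optimizer picks the datetime index for the range is worth confirming with EXPLAIN on the real data.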

Regarding "mysql - How can I optimize this query? It takes 3 minutes to run", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/45192462/
