MySQL/memSQL does not use an index for a BETWEEN join condition

Reposted. Author: 行者123. Updated: 2023-11-30 22:01:36

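We have two tables:

  • The dates table, which contains one date per day for the past 10 years and the next 10 years.
  • The states table, which contains the columns start_date, end_date, and state.

The query we run looks like this: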

SELECT dates.date, COUNT(*)
FROM dates
JOIN states
ON dates.date BETWEEN states.start_date AND states.end_date
WHERE dates.date BETWEEN '2017-01-01' AND '2017-01-31'
GROUP BY dates.date
ORDER BY dates.date;

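According to the query plan, memSQL does not use an index on the JOIN condition, which makes the query slow. Is there a way to make the JOIN condition use an index?

We tried memSQL skiplist indexes on dates.date, states.start_date, states.end_date, and (states.start_date, states.end_date).

Table definitions and EXPLAIN output: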

CREATE TABLE `dates` (
`date` date DEFAULT NULL,
KEY `date_index` (`date`)
)

CREATE TABLE `states` (
`start_date` datetime DEFAULT NULL,
`end_date` datetime DEFAULT NULL,
`state` varchar(256) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,
KEY `start_date` (`start_date`),
KEY `end_date` (`end_date`),
KEY `start_date_end_date` (`start_date`,`end_date`)
)

+-----------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN |
+-----------------------------------------------------------------------------------------------------------------------------------------------------+
| GatherMerge [remote_0.date] partitions:all est_rows:96 alias:remote_0 |
| Project [r2.date, CAST(COALESCE($0,0) AS SIGNED) AS `COUNT(*)`] est_rows:96 |
| Sort [r2.date] |
| HashGroupBy [SUM(r2.`COUNT(*)`) AS $0] groups:[r2.date] |
| TableScan r2 storage:list stream:no |
| Repartition [r1.date, `COUNT(*)`] AS r2 shard_key:[date] est_rows:96 est_select_cost:26764032 |
| HashGroupBy [COUNT(*) AS `COUNT(*)`] groups:[r1.date] |
| Filter [r1.date <= states.end_date] |
| NestedLoopJoin |
| |---IndexRangeScan drstates_test.states, KEY start_date (start_date) scan:[start_date <= r1.date] est_table_rows:123904 est_filtered:123904 |
| TableScan r1 storage:list stream:no |
| Broadcast [dates.date] AS r1 distribution:tree est_rows:96 |
| IndexRangeScan drstates_test.dates, KEY date_index (date) scan:[date >= '2017-01-01' AND date <= '2017-01-31'] est_table_rows:18628 est_filtered:96 |
+-----------------------------------------------------------------------------------------------------------------------------------------------------+

Best answer

ON dates.date BETWEEN states.start_date
AND states.end_date

is essentially unoptimizable. The only practical way to evaluate this condition is to tediously test every row.
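To see why, it may help to expand the BETWEEN into its two inequalities. The sketch below is an equivalent form of the query from the question, using only the tables and columns already shown. For each dates row, an ordered index such as start_date or (start_date, end_date) can typically narrow the scan with only one of the two inequalities; the other is then checked on every row that the scan returns. This matches the plan above, where IndexRangeScan uses scan:[start_date <= r1.date] and the end_date test appears separately as Filter [r1.date <= states.end_date].

```sql
-- Equivalent to: ON dates.date BETWEEN states.start_date AND states.end_date
SELECT dates.date, COUNT(*)
FROM dates
JOIN states
  ON  states.start_date <= dates.date   -- can drive an index range scan on start_date
  AND states.end_date   >= dates.date   -- applied row by row as a residual filter
WHERE dates.date BETWEEN '2017-01-01' AND '2017-01-31'
GROUP BY dates.date
ORDER BY dates.date;
```

Because most states rows are likely to satisfy start_date <= date for a recent date, the range scan alone removes few rows (the plan estimates est_filtered:123904 out of est_table_rows:123904).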

If you are using MySQL and do not need the dates table, consider querying states directly instead:

SELECT  *
FROM states
WHERE start_date >= '2017-01-01'
AND end_date < '2017-01-01' + INTERVAL 1 MONTH

Note that this works for any combination of DATE and DATETIME data types.
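Note also that the query above returns only states that both start and end within January 2017. If the goal is instead every state that overlaps the month, which is what the BETWEEN join in the question matches, a sketch under that assumption would swap the two bounds. The second statement adds the same bounds to the original join as redundant predicates: they are implied by the existing conditions, so the result should not change, but whether the optimizer actually uses them to narrow the scan on states is an assumption that needs checking with EXPLAIN.

```sql
-- States that overlap January 2017 at all
-- (assumed goal; the question does not state it explicitly)
SELECT *
FROM states
WHERE start_date <  '2017-01-01' + INTERVAL 1 MONTH
  AND end_date   >= '2017-01-01';

-- Original join, with the same bounds added as redundant predicates on states
SELECT dates.date, COUNT(*)
FROM dates
JOIN states
  ON dates.date BETWEEN states.start_date AND states.end_date
WHERE dates.date BETWEEN '2017-01-01' AND '2017-01-31'
  AND states.start_date <  '2017-02-01'   -- implied by start_date <= date <= '2017-01-31'
  AND states.end_date   >= '2017-01-01'   -- implied by end_date >= date >= '2017-01-01'
GROUP BY dates.date
ORDER BY dates.date;
```

The half-open upper bound (< '2017-02-01') is used so that DATETIME values late on January 31 are still included.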

Since the ultimate goal is not clear to me, I am not sure what to suggest next.

Regarding "MySQL/memSQL does not use an index for a BETWEEN join condition", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/43181286/
