mysql - Optimizing SELECT COUNT(DISTINCT(col)) var, col2 var2 FROM table WHERE col<>'X' AND col2 BETWEEN 'Y' AND 'Z' GROUP BY var2 ORDER BY var DESC for speed?


Reposted by: 搜寻专家 | Updated: 2023-10-30 23:28:13

I have this query, and it takes a very long time (about 10 minutes) to complete.

SELECT COUNT(DISTINCT(column)) var, 
       column2 var2 
FROM table 
WHERE column<>'X' and 
      column2 between 'Y' and 'Z' 
GROUP BY var2 
ORDER BY var DESC

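Any ideas on how to optimize it for speed? I tried using indexes, but it is still slow; maybe they are not set up correctly. Y and Z are timestamps, in case that matters. X is a value this query does not need at all, but it is in the table because other queries from the same application need it. The table is very large: millions of rows, and still growing.

Edit: here is the EXPLAIN output for the actual query: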

    mysql> EXPLAIN SELECT COUNT(DISTINCT(ip)) v, geo n from idevaff_iptracking where geo<>'XX' and stamp between '1525122000' and '1543615199' group by n order by v desc;
+------+-------------+--------------------+-------+------------------------+--------------+---------+------+---------+-----------------------------------------------------------+
| id   | select_type | table              | type  | possible_keys          | key          | key_len | ref  | rows    | Extra                                                     |
+------+-------------+--------------------+-------+------------------------+--------------+---------+------+---------+-----------------------------------------------------------+
|    1 | SIMPLE      | idevaff_iptracking | range | stamp,geo,geo_stamp_ip | geo_stamp_ip | 9       | NULL | 3469323 | Using where; Using index; Using temporary; Using filesort |
+------+-------------+--------------------+-------+------------------------+--------------+---------+------+---------+-----------------------------------------------------------+
1 row in set (0.00 sec)

The table columns are:

id,acct_id,ip,refer,stamp,hit_time,hit_date,src1,src2,split,sub_id,tid1,tid2,tid3,tid4,target_url,geo.

The indexes are:

    mysql> SHOW INDEX FROM idevaff_iptracking
    -> ;
+--------------------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table              | Non_unique | Key_name           | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+--------------------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| idevaff_iptracking |          0 | PRIMARY            |            1 | id          | A         |     6775984 |     NULL | NULL   |      | BTREE      |         |               |
| idevaff_iptracking |          1 | acct_id_ip         |            1 | acct_id     | A         |           2 |     NULL | NULL   |      | BTREE      |         |               |
| idevaff_iptracking |          1 | acct_id_ip         |            2 | ip          | A         |     6775984 |     NULL | NULL   | YES  | BTREE      |         |               |
| idevaff_iptracking |          1 | ip                 |            1 | ip          | A         |     6775984 |     NULL | NULL   | YES  | BTREE      |         |               |
| idevaff_iptracking |          1 | stamp              |            1 | stamp       | A         |     6775984 |     NULL | NULL   |      | BTREE      |         |               |
| idevaff_iptracking |          1 | acct_id            |            1 | acct_id     | A         |           4 |     NULL | NULL   |      | BTREE      |         |               |
| idevaff_iptracking |          1 | geo                |            1 | geo         | A         |         440 |     NULL | NULL   | YES  | BTREE      |         |               |
| idevaff_iptracking |          1 | tid1               |            1 | tid1        | A         |         276 |     NULL | NULL   | YES  | BTREE      |         |               |
| idevaff_iptracking |          1 | tid2               |            1 | tid2        | A         |         514 |     NULL | NULL   | YES  | BTREE      |         |               |
| idevaff_iptracking |          1 | tid3               |            1 | tid3        | A         |          34 |     NULL | NULL   | YES  | BTREE      |         |               |
| idevaff_iptracking |          1 | tid4               |            1 | tid4        | A         |        5623 |     NULL | NULL   | YES  | BTREE      |         |               |
| idevaff_iptracking |          1 | acct_id_stamp_ip   |            1 | acct_id     | A         |         744 |     NULL | NULL   |      | BTREE      |         |               |
| idevaff_iptracking |          1 | acct_id_stamp_ip   |            2 | stamp       | A         |     6775984 |     NULL | NULL   |      | BTREE      |         |               |
| idevaff_iptracking |          1 | acct_id_stamp_ip   |            3 | ip          | A         |     6775984 |     NULL | NULL   | YES  | BTREE      |         |               |
| idevaff_iptracking |          1 | geo_stamp_ip       |            1 | geo         | A         |       22362 |     NULL | NULL   | YES  | BTREE      |         |               |
| idevaff_iptracking |          1 | geo_stamp_ip       |            2 | stamp       | A         |     6775984 |     NULL | NULL   |      | BTREE      |         |               |
| idevaff_iptracking |          1 | geo_stamp_ip       |            3 | ip          | A         |     6775984 |     NULL | NULL   | YES  | BTREE      |         |               |
| idevaff_iptracking |          1 | acct_id_tid1_stamp |            1 | acct_id     | A         |         658 |     NULL | NULL   |      | BTREE      |         |               |
| idevaff_iptracking |          1 | acct_id_tid1_stamp |            2 | tid1        | A         |       11866 |     NULL | NULL   | YES  | BTREE      |         |               |
| idevaff_iptracking |          1 | acct_id_tid1_stamp |            3 | stamp       | A         |     6775984 |     NULL | NULL   |      | BTREE      |         |               |
| idevaff_iptracking |          1 | acct_id_tid2_stamp |            1 | acct_id     | A         |           2 |     NULL | NULL   |      | BTREE      |         |               |
| idevaff_iptracking |          1 | acct_id_tid2_stamp |            2 | tid2        | A         |       18666 |     NULL | NULL   | YES  | BTREE      |         |               |
| idevaff_iptracking |          1 | acct_id_tid2_stamp |            3 | stamp       | A         |     6775984 |     NULL | NULL   |      | BTREE      |         |               |
| idevaff_iptracking |          1 | acct_id_tid3_stamp |            1 | acct_id     | A         |           2 |     NULL | NULL   |      | BTREE      |         |               |
| idevaff_iptracking |          1 | acct_id_tid3_stamp |            2 | tid3        | A         |        1832 |     NULL | NULL   | YES  | BTREE      |         |               |
| idevaff_iptracking |          1 | acct_id_tid3_stamp |            3 | stamp       | A         |     6775984 |     NULL | NULL   |      | BTREE      |         |               |
| idevaff_iptracking |          1 | acct_id_tid4_stamp |            1 | acct_id     | A         |           2 |     NULL | NULL   |      | BTREE      |         |               |
| idevaff_iptracking |          1 | acct_id_tid4_stamp |            2 | tid4        | A         |        5060 |     NULL | NULL   | YES  | BTREE      |         |               |
| idevaff_iptracking |          1 | acct_id_tid4_stamp |            3 | stamp       | A         |     6775984 |     NULL | NULL   |      | BTREE      |         |               |
+--------------------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
29 rows in set (0.00 sec)

Best answer

Add this composite index:

INDEX(column2, column)

If that is not enough, we will need to see SHOW CREATE TABLE to discuss it further. (geo_stamp_ip is not as good.)
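Applied to the real table, the suggested index could look like the sketch below. The mapping is an assumption: the generic query uses `column` both for the `<>` filter and for COUNT(DISTINCT ...), while the real query uses two different columns (geo and ip), so this version starts with the range column stamp and then lists both. The index name stamp_geo_ip is made up for illustration.

```sql
-- Assumed mapping: column2 -> stamp, column -> geo (filter) and ip (distinct count).
-- The index name is hypothetical.
ALTER TABLE idevaff_iptracking
    ADD INDEX stamp_geo_ip (stamp, geo, ip);

-- Re-run EXPLAIN afterwards to see whether the optimizer picks the new index
-- and whether the estimated row count drops.
EXPLAIN
SELECT COUNT(DISTINCT ip) v, geo n
FROM idevaff_iptracking
WHERE geo <> 'XX'
  AND stamp BETWEEN '1525122000' AND '1543615199'
GROUP BY n
ORDER BY v DESC;
```

Because the index would contain every column the query reads, "Using index" should still appear in the Extra column. Whether it actually beats geo_stamp_ip on this data has to be measured.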

Spreading an array (the tid values) across columns is usually a mistake.
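One common alternative is to move the tid1..tid4 columns into a child table with one row per value. The table name, column names, and column types below are all assumptions, since SHOW CREATE TABLE was not posted; treat this as a sketch of the idea rather than a migration script.

```sql
-- Hypothetical child table: one row per (tracking row, tid slot).
-- Types are guesses; match them to the real id and tid columns.
CREATE TABLE idevaff_iptracking_tid (
    tracking_id BIGINT UNSIGNED  NOT NULL,  -- refers to idevaff_iptracking.id
    slot        TINYINT UNSIGNED NOT NULL,  -- 1..4, replaces the tid1..tid4 suffix
    tid         VARCHAR(64)      NOT NULL,
    PRIMARY KEY (tracking_id, slot),
    INDEX tid_tracking (tid, tracking_id)   -- look up tracking rows by tid value
);
```

With this layout a single index serves lookups on any tid slot, instead of one index per tidN column.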

EXPLAIN FORMAT=JSON
SELECT  COUNT(DISTINCT ip) v, geo n
    from  idevaff_iptracking
    where  geo<>'XX'
      and  stamp between '1525122000' AND '1543615199'
    group by  n
    order by  v desc;

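Some of the indexes are redundant. In general, if you have INDEX(a,b), you can drop INDEX(a). (Example: acct_id_ip makes the single-column acct_id index redundant.)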
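Going by the SHOW INDEX output in the question, two single-column indexes are the leading column of a composite index: acct_id (covered by acct_id_ip and others) and geo (covered by geo_stamp_ip). A hedged sketch of the cleanup, assuming no query names these indexes in an index hint:

```sql
-- Assumption: nothing references these indexes via USE INDEX / FORCE INDEX.
-- acct_id is the first column of acct_id_ip, acct_id_stamp_ip, and others.
ALTER TABLE idevaff_iptracking DROP INDEX acct_id;

-- geo is the first column of geo_stamp_ip.
ALTER TABLE idevaff_iptracking DROP INDEX geo;
```

The stamp and ip indexes are not in the same position: neither column leads any composite index listed, so the rule above does not apply to them.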

Regarding this question, we found a similar one on Stack Overflow: https://stackoverflow.com/questions/53412427/
