
mysql - How to optimize matching against a daterange table for dates with no records?

Reposted. Author: 行者123. Updated: 2023-11-29 01:44:08

Consider the following tables:

tweets                              daterange
---------------------------------   ----------
tweet_id   nyse_date    class       _date
---------------------------------   ----------
1          2011-03-12   2           2011-03-11
2          2011-03-12   1           2011-03-12
3          2011-03-12   1           2011-03-13
4          2011-03-12   1           2011-03-14
5          2011-03-12   0           2011-03-15
7          2011-03-13   1
8          2011-03-13   2
9          2011-03-13   3
10         2011-03-14   3

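Every tweet is assigned a "class", which can be 1, 2, or 3. I need an overview of the number of tweets per class for each day in the date range. So even though there are no tweets on 2011-03-11 and 2011-03-15, I still need those dates included in the result set, like this: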

nyse_date    total   class1   class2   class3
---------------------------------------------
2011-03-11   0       0        0        0
2011-03-12   5       3        1        0
2011-03-13   3       1        1        1
2011-03-14   1       0        0        1
2011-03-15   0       0        0        0

I tried the following query, but it simply times out (which it shouldn't, because the database is not that large):

SELECT
    t.nyse_date,
    COUNT(CASE WHEN t.nyse_date = d._date THEN 1 END) total,
    SUM(t.class=1) as neu,
    SUM(t.class=2) as pos,
    SUM(t.class=3) as neg
FROM tweets t
CROSS JOIN daterange d
GROUP BY t.nyse_date
ORDER BY t.nyse_date ASC

Here is the EXPLAIN output:

id  select_type  table  type  possible_keys  key   key_len  ref   rows    Extra
---------------------------------------------------------------------------------------------------
1   SIMPLE       d      ALL   NULL           NULL  NULL     NULL  148     Using temporary; Using filesort
1   SIMPLE       t      ALL   NULL           NULL  NULL     NULL  560783  Using join buffer

What am I doing wrong? Is there a more efficient way to make sure that all dates from the daterange table are included?

Edit: I also tried this query, but the outcome is the same: it runs until it times out.

SELECT
    t.nyse_date,
    COUNT(t.tweet_id) AS total,
    SUM(t.class=1) AS neu,
    SUM(t.class=2) AS pos,
    SUM(t.class=3) AS neg
FROM tweets t
LEFT JOIN daterange d
    ON t.nyse_date = d._date
GROUP BY t.nyse_date
ORDER BY t.nyse_date ASC

Here is the EXPLAIN output:

id  select_type  table  type  possible_keys  key   key_len  ref   rows    Extra
-------------------------------------------------------------------------------------------------
1   SIMPLE       t      ALL   NULL           NULL  NULL     NULL  560783  Using temporary; Using filesort
1   SIMPLE       d      ALL   NULL           NULL  NULL     NULL  148

Best answer

Your query runs slowly because it does not use any index on your tweets table.
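Both EXPLAIN outputs in the question show type = ALL and key = NULL for tweets, which is how a full table scan appears in MySQL. For the CROSS JOIN version, the row estimates (148 and 560783) also imply roughly 83 million row combinations before grouping. A minimal sketch of how this could be checked on your own server (the simplified SELECT is only an illustration, not one of the queries from the question):

```sql
-- List the indexes that currently exist on the tweets table.
SHOW INDEX FROM tweets;

-- Inspect the plan of a simple per-day aggregate.
-- type = ALL together with key = NULL indicates a full table scan.
EXPLAIN
SELECT nyse_date, COUNT(*) AS total
FROM tweets
GROUP BY nyse_date;
```

If SHOW INDEX lists nothing that covers nyse_date, the optimizer has no choice but to read every row.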

What you want to do is create a composite index on the (sp100_id, nyse_date) columns of the tweets table, and then run this query:

SELECT
    a.sp100_id,
    b._date,
    COALESCE(c.total,0) AS total,
    COALESCE(c.neu,0) AS neu,
    COALESCE(c.pos,0) AS pos,
    COALESCE(c.neg,0) AS neg,
    COALESCE(c.spamneu,0) AS spamneu
FROM
    sp100 a
CROSS JOIN
    daterange b
LEFT JOIN
(
    SELECT
        sp100_id,
        nyse_date,
        COUNT(1) AS total,
        COUNT(CASE class WHEN 1 THEN 1 END) AS neu,
        COUNT(CASE class WHEN 2 THEN 1 END) AS pos,
        COUNT(CASE class WHEN 3 THEN 1 END) AS neg,
        COUNT(CASE WHEN class = 1 AND type = 1 THEN 1 END) AS spamneu
    FROM tweets
    GROUP BY sp100_id, nyse_date
) c ON
    a.sp100_id = c.sp100_id AND b._date = c.nyse_date
ORDER BY
    a.sp100_id, b._date
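
The answer names the columns for the composite index but does not show the statement. As a sketch, it could be created as follows; the index name idx_tweets_sp100_date is a hypothetical choice and not something the answer specifies:

```sql
-- Composite index on the two columns the subquery groups by.
-- idx_tweets_sp100_date is a hypothetical name; pick any name you like.
CREATE INDEX idx_tweets_sp100_date
    ON tweets (sp100_id, nyse_date);
```

After creating it, running EXPLAIN on the query again should show whether MySQL actually picks the index for the grouped subquery.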

SQLFiddle Demo
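
The answer's query refers to an sp100 table and to sp100_id and type columns that the sample tables in the question do not show. For the two tables as posted, the same idea (aggregate tweets first, then LEFT JOIN the result onto daterange so that every date survives) might look like the sketch below. The column aliases follow the desired result set from the question; the index name and the choice of (nyse_date, class) as index columns are assumptions, not part of the original answer.

```sql
-- Assumed helper index: lets the per-day aggregate read from the index.
-- idx_tweets_date_class is a hypothetical name.
CREATE INDEX idx_tweets_date_class
    ON tweets (nyse_date, class);

SELECT
    d._date                AS nyse_date,
    COALESCE(c.total, 0)   AS total,
    COALESCE(c.class1, 0)  AS class1,
    COALESCE(c.class2, 0)  AS class2,
    COALESCE(c.class3, 0)  AS class3
FROM daterange d
LEFT JOIN
(
    -- One row per day that has at least one tweet.
    SELECT
        nyse_date,
        COUNT(*)       AS total,
        SUM(class = 1) AS class1,
        SUM(class = 2) AS class2,
        SUM(class = 3) AS class3
    FROM tweets
    GROUP BY nyse_date
) c ON c.nyse_date = d._date
ORDER BY d._date;
```

Note that daterange is on the left side of the join. The second query in the question joins in the other direction (tweets LEFT JOIN daterange), so dates without tweets can never appear in its output, regardless of how fast it runs.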

Regarding "mysql - How to optimize matching against a daterange table for dates with no records?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/11484793/
