gpt4 book ai didi

ClickHouse - ORDER BY WITH FILL - 月间隔

转载 作者:行者123 更新时间:2023-12-04 08:08:38 34 4
gpt4 key购买 nike

我有一个关于 FILL WITH 函数的问题。我需要一个按月分组的查询,其中包含空行以绘制在图表上。我使用 FILL WITH 函数。

我有一个简单的表格:

CREATE TABLE IF NOT EXISTS fillwith
(
`event_timestamp` Datetime64,
`event_date` Date,
`event_type` String
)
ENGINE = Memory

一些示例数据

insert into fillwith (event_timestamp, event_date, event_type) values ('2021-01-07 19:14:33.000', '2021-01-07', 'PRODUCT_VIEW');
insert into fillwith (event_timestamp, event_date, event_type) values ('2021-02-07 19:14:33.000', '2021-02-07', 'PRODUCT_CLICK');
insert into fillwith (event_timestamp, event_date, event_type) values ('2020-11-07 19:14:33.000', '2020-11-07', 'PRODUCT_VIEW');
insert into fillwith (event_timestamp, event_date, event_type) values ('2020-12-07 19:14:33.000', '2020-12-07', 'PRODUCT_VIEW');
insert into fillwith (event_timestamp, event_date, event_type) values ('2020-09-07 19:14:33.000', '2020-09-07', 'PRODUCT_VIEW');

以一天为间隔,我得到一个完整的天数列表,但没有排序,感觉它们是随机的天数

SELECT 
toDate(toStartOfInterval(event_date, toIntervalDay(1))) AS date,
countIf(event_type = 'PRODUCT_VIEW') AS views,
countIf(event_type = 'PRODUCT_CLICK') AS clicks
FROM fillwith
GROUP BY toDate(toStartOfInterval(event_date, toIntervalDay(1)))
ORDER BY date ASC
WITH FILL FROM toDate('2020-01-01') TO toDate('2021-12-01') STEP dateDiff('second', now(), now() + toIntervalDay(1))

结果:

┌───────date─┬─views─┬─clicks─┐
│ 2020-09-07 │ 1 │ 0 │
│ 2020-11-07 │ 1 │ 0 │
│ 2020-12-07 │ 1 │ 0 │
│ 2021-01-07 │ 1 │ 0 │
│ 2021-02-07 │ 0 │ 1 │
└────────────┴───────┴────────┘
┌───────date─┬─views─┬─clicks─┐
│ 2106-02-07 │ 0 │ 0 │
│ 2005-05-25 │ 0 │ 0 │
│ 2062-07-09 │ 0 │ 0 │
│ 2106-02-07 │ 0 │ 0 │
│ 1997-05-03 │ 0 │ 0 │
│ 2054-06-17 │ 0 │ 0 │
│ 2106-02-07 │ 0 │ 0 │
│ 1989-04-11 │ 0 │ 0 │
│ 2046-05-26 │ 0 │ 0 │
│ 2103-07-11 │ 0 │ 0 │

当我在一个月的时间间隔内尝试相同的操作时:

select 
toDate(toStartOfInterval(event_date, INTERVAL 1 month)) as date,
countIf(event_type = 'PRODUCT_VIEW') as views,
countIf(event_type = 'PRODUCT_CLICK') as clicks
from fillwith
GROUP BY toDate(toStartOfInterval(event_date, INTERVAL 1 month))
ORDER BY date ASC WITH FILL
FROM toDate('2020-01-01') TO toDate('2021-04-01') STEP dateDiff('second',
now(),
now() + INTERVAL 1 month)

结果:

┌───────date─┬─views─┬─clicks─┐
│ 2020-01-01 │ 0 │ 0 │
│ 2020-09-01 │ 1 │ 0 │
│ 2020-11-01 │ 1 │ 0 │
│ 2020-12-01 │ 1 │ 0 │
│ 2021-01-01 │ 1 │ 0 │
│ 2021-02-01 │ 0 │ 1 │
└────────────┴───────┴────────┘

但我希望:

┌───────date─┬─views─┬─clicks─┐
│ 2020-01-01 │ 0 │ 0 │
│ 2020-02-01 │ 0 │ 0 │
│ 2020-03-01 │ 0 │ 0 │
│ 2020-04-01 │ 0 │ 0 │
│ 2020-05-01 │ 0 │ 0 │
│ 2020-06-01 │ 0 │ 0 │
│ 2020-07-01 │ 0 │ 0 │
│ 2020-08-01 │ 0 │ 0 │
│ 2020-09-01 │ 1 │ 0 │
│ 2020-10-01 │ 0 │ 0 │
│ 2020-11-01 │ 1 │ 0 │
│ 2020-12-01 │ 1 │ 0 │
│ 2021-01-01 │ 1 │ 0 │
│ 2021-02-01 │ 0 │ 1 │
│ 2021-03-01 │ 0 │ 0 │
│ 2021-04-01 │ 0 │ 0 │
└────────────┴───────┴────────┘

有人知道为什么会这样吗?我该如何改进?

感谢您的帮助!

最佳答案

试试这个查询:

WITH toDate(0) AS start_date, toRelativeMonthNum(toDate(0)) AS relative_month_of_start_date
SELECT
addMonths(start_date, relative_month - relative_month_of_start_date) AS month,
views,
clicks
FROM
(
SELECT
toRelativeMonthNum(event_date) AS relative_month,
countIf(event_type = 'PRODUCT_VIEW') AS views,
countIf(event_type = 'PRODUCT_CLICK') AS clicks
FROM fillwith
GROUP BY relative_month
ORDER BY relative_month ASC
WITH FILL
FROM toRelativeMonthNum(toDate('2020-01-01'))
TO toRelativeMonthNum(toDate('2021-12-01')) STEP 1
)
ORDER BY month ASC

/*
┌──────month─┬─views─┬─clicks─┐
│ 2020-01-01 │ 0 │ 0 │
│ 2020-02-01 │ 0 │ 0 │
│ 2020-03-01 │ 0 │ 0 │
│ 2020-04-01 │ 0 │ 0 │
│ 2020-05-01 │ 0 │ 0 │
│ 2020-06-01 │ 0 │ 0 │
│ 2020-07-01 │ 0 │ 0 │
│ 2020-08-01 │ 0 │ 0 │
│ 2020-09-01 │ 1 │ 0 │
│ 2020-10-01 │ 0 │ 0 │
│ 2020-11-01 │ 1 │ 0 │
│ 2020-12-01 │ 1 │ 0 │
│ 2021-01-01 │ 1 │ 0 │
│ 2021-02-01 │ 0 │ 1 │
│ 2021-03-01 │ 0 │ 0 │
│ 2021-04-01 │ 0 │ 0 │
│ 2021-05-01 │ 0 │ 0 │
│ 2021-06-01 │ 0 │ 0 │
│ 2021-07-01 │ 0 │ 0 │
│ 2021-08-01 │ 0 │ 0 │
│ 2021-09-01 │ 0 │ 0 │
│ 2021-10-01 │ 0 │ 0 │
│ 2021-11-01 │ 0 │ 0 │
└────────────┴───────┴────────┘
*/

或替代方式:

SELECT
toStartOfMonth(date) AS month,
sum(views) AS views,
sum(clicks) AS clicks
FROM
(
SELECT
event_date AS date, /* or: toDate(toStartOfDay(event_timestamp)) AS date */
countIf(event_type = 'PRODUCT_VIEW') AS views,
countIf(event_type = 'PRODUCT_CLICK') AS clicks
FROM fillwith
GROUP BY date
ORDER BY date ASC
WITH FILL
FROM toDate('2020-01-01')
TO toDate('2021-12-01')
/* type of 'date' is Date => '1' means 1 day */
STEP 1
)
GROUP BY month
ORDER BY month ASC

/*
┌──────month─┬─views─┬─clicks─┐
│ 2020-01-01 │ 0 │ 0 │
│ 2020-02-01 │ 0 │ 0 │
│ 2020-03-01 │ 0 │ 0 │
│ 2020-04-01 │ 0 │ 0 │
│ 2020-05-01 │ 0 │ 0 │
│ 2020-06-01 │ 0 │ 0 │
│ 2020-07-01 │ 0 │ 0 │
│ 2020-08-01 │ 0 │ 0 │
│ 2020-09-01 │ 1 │ 0 │
│ 2020-10-01 │ 0 │ 0 │
│ 2020-11-01 │ 1 │ 0 │
│ 2020-12-01 │ 1 │ 0 │
│ 2021-01-01 │ 1 │ 0 │
│ 2021-02-01 │ 0 │ 1 │
│ 2021-03-01 │ 0 │ 0 │
│ 2021-04-01 │ 0 │ 0 │
│ 2021-05-01 │ 0 │ 0 │
│ 2021-06-01 │ 0 │ 0 │
│ 2021-07-01 │ 0 │ 0 │
│ 2021-08-01 │ 0 │ 0 │
│ 2021-09-01 │ 0 │ 0 │
│ 2021-10-01 │ 0 │ 0 │
│ 2021-11-01 │ 0 │ 0 │
└────────────┴───────┴────────┘
*/

关于ClickHouse - ORDER BY WITH FILL - 月间隔,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/66092272/

34 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com