clickhouse - Can I query the hourly increments of a cumulative column in ClickHouse?

Reposted. Author: 行者123. Updated: 2023-12-03 20:26:11
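I want to store the event time and the total electricity generated, recorded every 30 seconds. The total is never reset to zero: it is the cumulative amount since the meter started, not the amount generated within those 30 seconds.

Is there a way to query daily, weekly, or monthly aggregates (perhaps more than just sum or avg) of the electricity generated?

Or should I design an AggregatingMergeTree table for this?

I don't need to keep every record, only the daily, weekly, and monthly aggregates.

For example: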

create table meter_record (
event_time Datetime,
generated_total Int64
)

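Best answer

Update

Prefer SimpleAggregateFunction to AggregateFunction for simple functions such as min, max, and sum to speed up aggregation. Functions that need an intermediate state, such as median and avg, still require AggregateFunction.

Suppose you need to compute the median, average, and dispersion (variance) aggregates for this table: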

CREATE TABLE meter_record (
event_time Datetime,
generated_total Int64
)
ENGINE = MergeTree
PARTITION BY (toYYYYMM(event_time))
ORDER BY (event_time);

Use AggregatingMergeTree to compute the required aggregates:

CREATE MATERIALIZED VIEW meter_aggregates_mv
ENGINE = AggregatingMergeTree()
PARTITION BY toYYYYMM(day)
ORDER BY (day)
AS
SELECT
toDate(toStartOfDay(event_time)) AS day,
/* aggregates to calculate the day's section left and right endpoints */
minState(generated_total) min_generated_total,
maxState(generated_total) max_generated_total,
/* specific aggregates */
medianState(generated_total) AS totalMedian,
avgState(generated_total) AS totalAvg,
varPopState(generated_total) AS totalDispersion
/* ... */
FROM meter_record
GROUP BY day;
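
The update note about SimpleAggregateFunction could be applied roughly as follows. This is a minimal sketch, not the answer's own code: the names meter_daily and meter_daily_mv are hypothetical, and it assumes the view writes into an explicit target table through the TO clause. The min and max columns hold plain Int64 values, while avg keeps a full aggregate state.

```sql
-- Hypothetical target table: min/max are stored as plain values,
-- avg still needs an intermediate state.
CREATE TABLE meter_daily (
    day Date,
    min_generated_total SimpleAggregateFunction(min, Int64),
    max_generated_total SimpleAggregateFunction(max, Int64),
    totalAvg AggregateFunction(avg, Int64)
)
ENGINE = AggregatingMergeTree()
PARTITION BY toYYYYMM(day)
ORDER BY (day);

-- Hypothetical view that feeds the target table on every insert into meter_record.
CREATE MATERIALIZED VIEW meter_daily_mv TO meter_daily
AS
SELECT
    toDate(event_time) AS day,
    min(generated_total) AS min_generated_total,   -- no -State suffix
    max(generated_total) AS max_generated_total,   -- no -State suffix
    avgState(generated_total) AS totalAvg
FROM meter_record
GROUP BY day;

-- Reading back: plain min/max for the simple columns, -Merge for the state.
SELECT
    day,
    min(min_generated_total) AS min_generated_total,
    max(max_generated_total) AS max_generated_total,
    avgMerge(totalAvg) AS totalAvg
FROM meter_daily
GROUP BY day;
```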

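To get the required daily, weekly, or monthly aggregates (or any other day-based aggregate, such as quarterly or yearly), use the following queries: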

/* daily report */
SELECT
day,
minMerge(min_generated_total) min_generated_total,
maxMerge(max_generated_total) max_generated_total,
medianMerge(totalMedian) AS totalMedian,
avgMerge(totalAvg) AS totalAvg,
varPopMerge(totalDispersion) AS totalDispersion
FROM meter_aggregates_mv
/*WHERE day >= '2019-02-05' and day < '2019-07-01'*/
GROUP BY day;
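
Because generated_total is cumulative, the min and max endpoints are what give the amount actually generated in a period. The sketch below is an illustration under assumptions, not part of the original answer: it assumes a ClickHouse version with window function support, and the aliases day_min, day_max, and generated are hypothetical. It subtracts the previous day's last reading from the current day's last reading, and falls back to the day's own first reading for the first row.

```sql
SELECT
    day,
    -- last reading of this day minus last reading of the previous day;
    -- the third argument is the fallback used when there is no previous row
    day_max - lagInFrame(day_max, 1, day_min)
        OVER (ORDER BY day ASC ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) AS generated
FROM
(
    SELECT
        day,
        minMerge(min_generated_total) AS day_min,
        maxMerge(max_generated_total) AS day_max
    FROM meter_aggregates_mv
    GROUP BY day
)
ORDER BY day;
```

The simpler expression day_max - day_min also works, but it leaves out whatever was generated between the last reading of one day and the first reading of the next. If a whole day is missing from the data, the query above attributes that day's output to the following day.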

/* weekly report */
SELECT
toStartOfWeek(day, 1) monday,
minMerge(min_generated_total) min_generated_total,
maxMerge(max_generated_total) max_generated_total,
medianMerge(totalMedian) AS totalMedian,
avgMerge(totalAvg) AS totalAvg,
varPopMerge(totalDispersion) AS totalDispersion
FROM meter_aggregates_mv
/*WHERE day >= '2019-02-05' and day < '2019-07-01'*/
GROUP BY monday;

/* monthly report */
SELECT
toStartOfMonth(day) month,
minMerge(min_generated_total) min_generated_total,
maxMerge(max_generated_total) max_generated_total,
medianMerge(totalMedian) AS totalMedian,
avgMerge(totalAvg) AS totalAvg,
varPopMerge(totalDispersion) AS totalDispersion
FROM meter_aggregates_mv
/*WHERE day >= '2019-02-05' and day < '2019-07-01'*/
GROUP BY month;

/* get daily / weekly / monthly reports in one query (thanks @Denis Zhuravlev for advice) */
SELECT
day,
toStartOfWeek(day, 1) AS week,
toStartOfMonth(day) AS month,
minMerge(min_generated_total) min_generated_total,
maxMerge(max_generated_total) max_generated_total,
medianMerge(totalMedian) AS totalMedian,
avgMerge(totalAvg) AS totalAvg,
varPopMerge(totalDispersion) AS totalDispersion
FROM meter_aggregates_mv
/*WHERE (day >= '2019-05-01') AND (day < '2019-06-01')*/
GROUP BY month, week, day WITH ROLLUP
ORDER BY day, week, month;
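
The reports above are no finer than a day, since the view groups by day. For the hourly increments asked about in the title, one option is to query the raw table directly. This is a rough sketch that assumes the raw rows in meter_record are still available; the aliases hour and generated_in_hour are hypothetical.

```sql
SELECT
    toStartOfHour(event_time) AS hour,
    -- the column only grows, so max is the last reading and min the first reading of the hour
    max(generated_total) - min(generated_total) AS generated_in_hour
FROM meter_record
GROUP BY hour
ORDER BY hour;
```

For an hourly grain without raw data, the materialized view would have to group by toStartOfHour(event_time) instead of by day.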

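Remarks:
  • You said you need only the aggregates, not the raw data, so you can set the engine of the meter_record table to Null, clean meter_record manually (see DROP PARTITION), or define a TTL to do it automatically
  • Deleting raw data is bad practice, because it makes it impossible to compute new aggregates over historical data or to restore existing aggregates
  • The materialized view meter_aggregates_mv will contain only the data inserted into meter_record after the view was created. To change this behavior, use POPULATE in the view definition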
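
The options in these remarks might look as follows. This is a sketch with assumed values: the three-month retention, the partition 201901, and the names meter_record_null and meter_aggregates_populated_mv are illustrative only and do not come from the original answer.

```sql
-- Option 1: expire raw rows automatically (the retention period is an arbitrary example).
ALTER TABLE meter_record MODIFY TTL event_time + INTERVAL 3 MONTH;

-- Option 2: drop one monthly partition by hand (the partition key is toYYYYMM(event_time)).
ALTER TABLE meter_record DROP PARTITION 201901;

-- Option 3: a Null-engine table stores nothing, but materialized views
-- attached to it still receive every inserted block.
CREATE TABLE meter_record_null (
    event_time DateTime,
    generated_total Int64
)
ENGINE = Null;

-- Backfill: POPULATE loads the rows already present in meter_record when the view is created.
CREATE MATERIALIZED VIEW meter_aggregates_populated_mv
ENGINE = AggregatingMergeTree()
PARTITION BY toYYYYMM(day)
ORDER BY (day)
POPULATE
AS
SELECT
    toDate(event_time) AS day,
    minState(generated_total) AS min_generated_total,
    maxState(generated_total) AS max_generated_total
FROM meter_record
GROUP BY day;
```

Rows inserted into meter_record while a POPULATE view is being created may be missed by the view, so it is safer to run the backfill while writes are paused.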

A similar question on Stack Overflow: https://stackoverflow.com/questions/59314925/
