gpt4 book ai didi

sql - 我如何询问随时间变化的百分比

转载 作者:行者123 更新时间:2023-11-29 11:55:34 25 4
gpt4 key购买 nike

所以我需要使用 PostgreSQL 并询问有关 COUNT(DISTINCT userid) 的一天到 7 天之前的百分比变化。

这可能吗?

按天获取非重复用户非常简单:

SELECT COUNT(DISTINCT userid), timestamp::date 
FROM logs
GROUP BY timestamp::date
ORDER BY timestamp::date DESC

我如何将其转换为今天到 7 天前的百分比?

最佳答案

因此我们需要为第 X 天取一个值,为第 X - 7 天取第二个值,然后计算 %。
查询可能如下所示:

SELECT a.timestamp, 
a.cnt,
b.cnt cnt_minus_7_day,
round( 100.0 *( a.cnt - b.cnt ) / b.cnt , 2 ) change_7_days
from (
SELECT timestamp::date, COUNT(DISTINCT userid) cnt
FROM logs
GROUP BY timestamp::date
ORDER BY timestamp::date
) a
left join (
SELECT timestamp::date, COUNT(DISTINCT userid) cnt
FROM logs
GROUP BY timestamp::date
ORDER BY timestamp::date
) b
ON a.timestamp = b.timestamp - 7
;


你也可以尝试另一个版本 - 这个应该更快,
因为看起来 postgresql 不够聪明并且对同一个子查询求值两次,
而不是在内存或临时表中兑现结果。
WITH 子句有助于避免这种情况(比较下面的计划)。

with src as (
SELECT timestamp::date, COUNT(DISTINCT userid) cnt
FROM logs
GROUP BY timestamp::date
ORDER BY timestamp::date
)
SELECT a.timestamp,
a.cnt,
b.cnt cnt_minus_7_day,
round( 100.0 *( a.cnt - b.cnt ) / b.cnt , 2 ) change_7_days
FROM src a
left join src b
on a.timestamp = b.timestamp - 7


这是第一个查询的计划(在我的示例数据上运行):

"Hash Left Join  (cost=5136.71..5350.93 rows=101 width=20) (actual time=77.778..88.676 rows=101 loops=1)"
" Hash Cond: (public.logs."timestamp" = (b."timestamp" - 7))"
" -> GroupAggregate (cost=2462.13..2672.31 rows=101 width=8) (actual time=44.398..55.129 rows=101 loops=1)"
" -> Sort (cost=2462.13..2531.85 rows=27889 width=8) (actual time=44.290..48.392 rows=27889 loops=1)"
" Sort Key: public.logs."timestamp""
" Sort Method: external merge Disk: 488kB"
" -> Seq Scan on logs (cost=0.00..402.89 rows=27889 width=8) (actual time=0.037..10.396 rows=27889 loops=1)"
" -> Hash (cost=2673.32..2673.32 rows=101 width=12) (actual time=33.355..33.355 rows=101 loops=1)"
" Buckets: 1024 Batches: 1 Memory Usage: 5kB"
" -> Subquery Scan on b (cost=2462.13..2673.32 rows=101 width=12) (actual time=22.883..33.306 rows=101 loops=1)"
" -> GroupAggregate (cost=2462.13..2672.31 rows=101 width=8) (actual time=22.881..33.288 rows=101 loops=1)"
" -> Sort (cost=2462.13..2531.85 rows=27889 width=8) (actual time=22.817..26.507 rows=27889 loops=1)"
" Sort Key: public.logs."timestamp""
" Sort Method: external merge Disk: 488kB"
" -> Seq Scan on logs (cost=0.00..402.89 rows=27889 width=8) (actual time=0.014..3.696 rows=27889 loops=1)"
"Total runtime: 100.360 ms"

第二个版本:

"Hash Left Join  (cost=2675.59..2680.64 rows=101 width=20) (actual time=60.612..60.785 rows=101 loops=1)"
" Hash Cond: (a."timestamp" = (b."timestamp" - 7))"
" CTE src"
" -> GroupAggregate (cost=2462.13..2672.31 rows=101 width=8) (actual time=46.498..60.425 rows=101 loops=1)"
" -> Sort (cost=2462.13..2531.85 rows=27889 width=8) (actual time=46.382..51.113 rows=27889 loops=1)"
" Sort Key: logs."timestamp""
" Sort Method: external merge Disk: 488kB"
" -> Seq Scan on logs (cost=0.00..402.89 rows=27889 width=8) (actual time=0.037..8.945 rows=27889 loops=1)"
" -> CTE Scan on src a (cost=0.00..2.02 rows=101 width=12) (actual time=46.504..46.518 rows=101 loops=1)"
" -> Hash (cost=2.02..2.02 rows=101 width=12) (actual time=14.084..14.084 rows=101 loops=1)"
" Buckets: 1024 Batches: 1 Memory Usage: 5kB"
" -> CTE Scan on src b (cost=0.00..2.02 rows=101 width=12) (actual time=0.002..14.033 rows=101 loops=1)"
"Total runtime: 67.799 ms"

关于sql - 我如何询问随时间变化的百分比,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/17983423/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com