gpt4 book ai didi

sql-server - SQL Server : Lead/Lag analytic function across groups (and not within groups)

转载 作者:行者123 更新时间:2023-12-02 12:08:45 27 4
gpt4 key购买 nike

很抱歉这篇文章很长,但我在下面提供了复制和粘贴示例数据以及可能的解决方案。 问题的相关部分位于帖子的上部(水平线上方)。

我有下表

 Dt          customer_id  buy_time     money_spent
-------------------------------------------------
2000-01-04 100 11:00:00.00 2
2000-01-05 100 16:00:00.00 1
2000-01-10 100 13:00:00.00 4
2000-01-10 100 14:00:00.00 3
2000-01-04 200 09:00:00.00 10
2000-01-06 200 10:00:00.00 11
2000-01-06 200 11:00:00.00 5
2000-01-10 200 08:00:00.00 20

并且想要一个查询来获取这个结果集

 Dt          Dt_next     customer_id  buy_time     money_spent
-------------------------------------------------------------
2000-01-04 2000-01-05 100 11:00:00.00 2
2000-01-05 2000-01-10 100 16:00:00.00 1
2000-01-10 NULL 100 13:00:00.00 4
2000-01-10 NULL 100 14:00:00.00 3
2000-01-04 2000-01-06 200 09:00:00.00 10
2000-01-06 2000-01-10 200 10:00:00.00 11
2000-01-06 2000-01-10 200 11:00:00.00 5
2000-01-10 NULL 200 08:00:00.00 20

也就是说:我想要每个客户 (customer_id) 和每一天 (Dt) 同一客户访问过的第二天 (Dt_next) >).

我已经有一个查询给出了后一个结果集(水平规则下方包含的数据和查询)。但是,它涉及一个左外连接和两个dense_rank聚合函数。这种方法对我来说似乎有点笨拙,我认为应该有更好的解决方案。 任何指向替代解决方案的指示都受到高度赞赏!谢谢!

顺便说一句:我使用的是 SQL Server 11,表中有 >>1m 条目。

<小时/>

我的查询:

 select
customer_table.Dt
,customer_table_lead.Dt as Dt_next
,customer_table.customer_id
,customer_table.buy_time
,customer_table.money_spent
from
(
select
#customer_data.*
,dense_rank() over (partition by customer_id order by customer_id asc, Dt asc) as Dt_int
from #customer_data
) as customer_table
left outer join
(
select distinct
#customer_data.Dt
,#customer_data.customer_id
,dense_rank() over (partition by customer_id order by customer_id asc, Dt asc)-1 as Dt_int
from #customer_data
) as customer_table_lead
on
(
customer_table.Dt_int=customer_table_lead.Dt_int
and customer_table.customer_id=customer_table_lead.customer_id
)

示例数据:

 create table #customer_data (
Dt date not null,
customer_id int not null,
buy_time time(2) not null,
money_spent float not null
);

insert into #customer_data values ('2000-01-04',100,'11:00:00',2);
insert into #customer_data values ('2000-01-05',100,'16:00:00',1);
insert into #customer_data values ('2000-01-10',100,'13:00:00',4);
insert into #customer_data values ('2000-01-10',100,'14:00:00',3);

insert into #customer_data values ('2000-01-04',200,'09:00:00',10);
insert into #customer_data values ('2000-01-06',200,'10:00:00',11);
insert into #customer_data values ('2000-01-06',200,'11:00:00',5);
insert into #customer_data values ('2000-01-10',200,'08:00:00',20);

最佳答案

尝试这个查询:

select cd.Dt
, t.Dt_next
, cd.customer_id
, cd.buy_time
, cd.money_spent
from (
select Dt
, LEAD(Dt) OVER (PARTITION BY customer_id ORDER BY Dt) AS Dt_next
, customer_id
from (
select distinct Dt, customer_id
from #customer_data
) t
) t
inner join #customer_data cd on t.customer_id = cd.customer_id and t.Dt = cd.Dt

为什么字段money_spent是float类型?您可能在计算方面遇到问题。将其转换为十进制类型。

关于sql-server - SQL Server : Lead/Lag analytic function across groups (and not within groups),我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/18915208/

27 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com