gpt4 book ai didi

sql - 条件总和的左连接横向

转载 作者:行者123 更新时间:2023-11-29 11:56:08 26 4
gpt4 key购买 nike

我有一个包含客户、产品和类别的购买数据集。

customer     product     category    sales_value
A aerosol air_care 10
B aerosol air_care 12
C aerosol air_care 7
A perfume air_care 8
A perfume air_care 2
D perfume air_care 11
C burger food 13
D fries food 6
C fries food 9

我想要了解每种产品在该产品上花费的销售值(value)与至少购买该产品一次的客户在该产品类别上花费的销售值(value)之间的比率。

另一种说法:选取至少购买过一次薯条的客户,计算所有这些人的 A) 花在薯条上的销售值(value)总和以及B) 花在食品上的销售值(value)总和。

中间表的形式如下:

product    category  sum_spent_on_product           sum_spent_on_category    ratio
by_people_buying_product
aerosol air_care 29 39 0.74
perfume air_care 21 31 0.68
burger food 13 22 0.59
fries food 15 28 0.53

示例:至少购买过一次气雾剂的人在该产品上总共花费了 1800 美元。总体而言,这些人在 air_care 类别(aerosol 所属)上花费了 3600 美元。因此,气溶胶的比率为 0.5。

我尝试使用左连接横向来解决这个问题,并计算每个产品给定的中间结果,但我无法理解如何解决这个问题添加条件仅适用于购买此特定产品的客户:

select
distinct (product_id)
, category
, c.sales_category
from transactions t
left join lateral (
select
sum(sales_value) as sales_category
from transactions
where category = t.category
group by category
) c on true
;

以上查询列出了每种产品在产品类别上的支出总和,但没有所需的产品购买者条件。

左连接横向是正确的方法吗?普通 SQL 中还有其他解决方案吗?

最佳答案

我会使用窗口函数来计算每个客户在每个类别中的总支出:

SELECT
customer, product, category, sales_value,
sum(sales_value) OVER (PARTITION BY customer, category) AS tot_cat
FROM transactions;

customer | product | category | sales_value | tot_cat
----------+---------+----------+-------------+---------
A | aerosol | air_care | 10.00 | 20.00
A | perfume | air_care | 8.00 | 20.00
A | perfume | air_care | 2.00 | 20.00
B | aerosol | air_care | 12.00 | 12.00
C | aerosol | air_care | 7.00 | 7.00
C | fries | food | 9.00 | 22.00
C | burger | food | 13.00 | 22.00
D | perfume | air_care | 11.00 | 11.00
D | fries | food | 6.00 | 6.00

那么我们只需要总结一下。当顾客多次购买同一种产品时就会出现问题。在您的示例中,客户 A 购买了香水两次。为了解决这个问题,让我们同时按客户、产品和类别进行分组(并对 sales_value 列求和):

SELECT
customer, product, category, SUM(sales_value) AS sales_value,
SUM(SUM(sales_value)) OVER (PARTITION BY customer, category) AS tot_cat
FROM transactions
GROUP BY customer, product, category

customer | product | category | sales_value | tot_cat
----------+---------+----------+-------------+---------
A | aerosol | air_care | 10.00 | 20.00
A | perfume | air_care | 10.00 | 20.00 <-- this row summarizes rows 2 and 3 of previous result
B | aerosol | air_care | 12.00 | 12.00
C | aerosol | air_care | 7.00 | 7.00
C | burger | food | 13.00 | 22.00
C | fries | food | 9.00 | 22.00
D | perfume | air_care | 11.00 | 11.00
D | fries | food | 6.00 | 6.00

现在我们只需将 sales_value 和 tot_cat 相加即可得到中间结果表。我使用公共(public)表表达式来获取名为 t 的先前结果:

WITH t AS (
SELECT
customer, product, category, SUM(sales_value) AS sales_value,
SUM(SUM(sales_value)) OVER (PARTITION BY customer, category) AS tot_cat
FROM transactions
GROUP BY customer, product, category
)
SELECT
product, category,
sum(sales_value) AS sales_value, sum(tot_cat) AS tot_cat,
sum(sales_value) / sum(tot_cat) AS ratio
FROM t
GROUP BY product, category;

product | category | sales_value | tot_cat | ratio
---------+----------+-------------+---------+------------------------
aerosol | air_care | 29.00 | 39.00 | 0.74358974358974358974
fries | food | 15.00 | 28.00 | 0.53571428571428571429
burger | food | 13.00 | 22.00 | 0.59090909090909090909
perfume | air_care | 21.00 | 31.00 | 0.67741935483870967742

关于sql - 条件总和的左连接横向,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/54852276/

26 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com