gpt4 book ai didi

Hadoop排名列

转载 作者:可可西里 更新时间:2023-11-01 15:18:14 25 4
gpt4 key购买 nike

所以我有这些我正在使用的特定列:customer_tokenmerchant_idmerchant_category_codetransaction_amount

我当前的查询是这样的:

SELECT customer_token, COUNT(transaction_amount), SUM(transaction_amount)
FROM transaction
WHERE file_date>20121031
and file_date<20121201
GROUP BY customer_token

我想在上面的查询中添加一个部分,在结果中,merchant_category_code 根据每个特定 merchant_category_code 中的交易金额分成不同的列。结果看起来像这样:

customer_token, count(transaction_amount),sum(transaction_amount), count(transaction_amount in merchant_category_code which is ranked 1), count(transaction_amount in merchant_category_code which is ranked 2), count(transaction_amount in merchant_category_code which is ranked 3), etc...

然后是:

customer_token, count(transaction_amount),sum(transaction_amount), sum(transaction_amount in merchant_category_code which is ranked 1), sum(transaction_amount in merchant_category_code which is ranked 2), sum(transaction_amount in merchant_category_code which is ranked 3), etc...

但我不知道如何做到这一点,或者是否有可能做到这一点。

最佳答案

如果您事先知道merchant_category_code 的可能值是什么,您可以使用CASE 表达式:

SELECT customer_token,
COUNT(transaction_amount),
SUM(transaction_amount),
COUNT(CASE WHEN merchant_category_code = 1 THEN transaction_amount END),
COUNT(CASE WHEN merchant_category_code = 2 THEN transaction_amount END),
COUNT(CASE WHEN merchant_category_code = 3 THEN transaction_amount END),
...
SUM(CASE WHEN merchant_category_code = 1 THEN transaction_amount END),
SUM(CASE WHEN merchant_category_code = 2 THEN transaction_amount END),
SUM(CASE WHEN merchant_category_code = 3 THEN transaction_amount END),
...
FROM transaction
WHERE file_date BETWEEN 20121101 AND 20121130
GROUP
BY customer_token
;

(或 IF 表达式,如果您愿意的话;有关这两者的文档,请参阅 the section titled "Conditional Functions" on the page "LanguageManual+UDF" in the Hive wiki)。

关于Hadoop排名列,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/13772353/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com