gpt4 book ai didi

hadoop - 带有子查询的查询的 Hive JOIN 需要永远

转载 作者:可可西里 更新时间:2023-11-01 16:54:15 24 4
gpt4 key购买 nike

最近我一直在玩 Hive。大多数事情都进展顺利,但是,当我尝试转换类似

2015-04-01   device1   traffic    other       start   
2015-04-01 device1 traffic violation deny
2015-04-01 device1 traffic violation deny
2015-04-02 device1 traffic other start
2015-04-03 device1 traffic other start
2015-04-03 device1 traffic other start

进入

2015-04-01   1       2
2015-04-02 1
2015-04-03 2

我尝试使用以下查询,但出于某种原因,无论我等待多长时间,查询的 reduce 阶段都停留在 96%。

SELECT pass.date, COUNT(pass.type), COUNT(deny.deny_type) FROM firewall_logs as pass
JOIN (
SELECT date, type as deny_type FROM firewall_logs
WHERE device = 'device1'
AND date LIKE '2015-04-%'
AND type = 'traffic' AND subtype = 'violation' and status = 'deny'
) deny ON ( pass.date = deny.date )
WHERE pass.device = 'device1'
AND pass.date LIKE '2015-04-%'
AND pass.type = 'traffic' AND pass.subtype = 'other' AND pass.status = 'start'
GROUP BY pass.date ORDER BY pass.date ;

所有 MR2 日志显示为:

2015-06-11 01:54:04,206 INFO [main] org.apache.hadoop.hive.ql.exec.CommonJoinOperator: table 0 has 9028000 rows for join key [2015-04-26]
2015-06-11 01:54:04,423 INFO [main] org.apache.hadoop.hive.ql.exec.CommonJoinOperator: table 0 has 9128000 rows for join key [2015-04-26]
2015-06-11 01:54:04,638 INFO [main] org.apache.hadoop.hive.ql.exec.CommonJoinOperator: table 0 has 9228000 rows for join key [2015-04-26]
2015-06-11 01:54:04,838 INFO [main] org.apache.hadoop.mapred.FileInputFormat: Total input paths to process : 1

有人知道为什么吗?

最佳答案

我尽量避免像瘟疫一样在 Hive 中 self 加入。您可以通过收集和创建 map 来做到这一点

add jar ./brickhouse-0.7.1.jar;
create temporary function collect as 'brickhouse.udf.collect.CollectUDAF';

select date
, c_map['start'] starts
, c_map['deny'] denies
from (
select date
, collect(status, c) c_map
from (
select date, status
, count( subtype ) c
from table
where device='device1' and type='traffic'
group by date, status ) x
group by date ) y

关于hadoop - 带有子查询的查询的 Hive JOIN 需要永远,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/30761339/

24 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com