gpt4 book ai didi

sql - Hive 连接或子查询混淆

转载 作者:可可西里 更新时间:2023-11-01 15:29:35 26 4
gpt4 key购买 nike

(SELECT
id,
SUM(hits / ab) AS HAB
FROM batting
GROUP BY id
) b

SELECT id, bmonth, bstate FROM master a

WHERE bmonth >= 0 AND bstate is NOT NULL
GROUP By bmonth,bstate

到目前为止,我有这些胡言乱语,但我迷失了如何形成连接然后继续。我不确定从哪里开始得到东西。我们应该加入还是使用子查询?请协助

在下面查找架构:

CREATE EXTERNAL TABLE IF NOT EXISTS batting
(id STRING, year INT, team STRING,
league STRING, games INT, ab INT, runs INT, hits INT, doubles INT, triples INT,
homeruns INT, rbi INT, sb INT, cs INT, walks INT, strikeouts INT, ibb INT,
hbp INT, sh INT, sf INT, gidp INT)
ROW FORMAT DELIMITED FIELDS
TERMINATED BY ',' LOCATION '/home/hduser/hivetest/batting';

CREATE EXTERNAL TABLE IF NOT EXISTS master
(id STRING, byear INT, bmonth INT, bday INT, bcountry STRING, bstate STRING,
bcity STRING, dyear INT, dmonth INT, dday INT, dcountry STRING, dstate STRING,
dcity STRING, fname STRING, lname STRING, name STRING, weight INT, height INT,
bats STRING, throws STRING, debut STRING, finalgame STRING, retro STRING,
bbref STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LOCATION '/home/hduser/hivetest/master';

最佳答案

首先确保至少 3 名玩家来自同一州和同一月。您必须从主表中获取集合。计算每个州/月的 ID 并过滤结果,其中 count(id) >=3

select bstate,bmonth from master
group by bstate,bmonth
having count(id) >=3

然后,您必须使用上述集合加入击球表,按月、州分组并按总和(命中)/总和(击球)排序并获得第一行。

select a.bmonth,a.bstate,SUM(c.hits)/SUM(b.bats) hb
from (select bmonth,bstate from master
group by bmonth,bstate
having count(id) >=3) a
join master b on a.bstate=b.state and a.month = b.month
join batting c on b.id = c.id
group by a.bmonth,a.bstate
order by hb
limit 1;

关于sql - Hive 连接或子查询混淆,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/36853293/

26 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com