
hadoop - Aggregating values in Pig Latin

Reposted. Author: 可可西里. Updated: 2023-11-01 16:51:33

After several levels of filtering in Pig, I get the following result:

(2343433,Argentina,2015,Sci-Fi)
(2343433,France,2015,Sci-Fi)
(2343433,Germany,2015,Sci-Fi)
(2343433,Netherlands,2015,Sci-Fi)
(2343433,Argentina,2015,Drama)
(2343433,France,2015,Drama)
(2343433,Germany,2015,Drama)
(2343433,Netherlands,2015,Drama)
(2343433,Argentina,2015,Family)
(2343433,France,2015,Family)
(2343433,Germany,2015,Family)
(2343433,Netherlands,2015,Family)

The columns are movieid, country, year, and genre. I need to aggregate these results and produce something like this:

(2343433,France,2015,Sci-Fi,Drama,Family)
(2343433,Germany,2015,Sci-Fi,Drama,Family)
(2343433,Netherlands,2015,Sci-Fi,Drama,Family)
(2343433,Argentina,2015,Sci-Fi,Drama,Family)

Either that, or something like this:

 (2343433,France,Germany,Netherlands,Argentina,2015,Sci-Fi,Drama,Family)

Here is the code I used to get the result above:

A = LOAD '/user/a1.csv' USING PigStorage('|') as (movie_id,movie_name,prod_year);
B = LOAD '/user/a2.csv' USING PigStorage('|') as (g_movieid,genres);
C = LOAD '/user/a3.csv' USING PigStorage('|') as (c_movieid,country_released);
D = JOIN A by movie_id, B by g_movieid;
E = JOIN D by g_movieid, C by c_movieid;
F = FOREACH E GENERATE movie_id, country_released AS country, prod_year AS year, genres AS genre;

Any ideas on how to achieve this with Pig?

Best answer

Try this:

Dump F;
(2343433,Argentina,2015,Sci-Fi)
(2343433,France,2015,Sci-Fi)
(2343433,Germany,2015,Sci-Fi)
(2343433,Netherlands,2015,Sci-Fi)
(2343433,Argentina,2015,Drama)
(2343433,France,2015,Drama)
(2343433,Germany,2015,Drama)
(2343433,Netherlands,2015,Drama)
(2343433,Argentina,2015,Family)
(2343433,France,2015,Family)
(2343433,Germany,2015,Family)
(2343433,Netherlands,2015,Family)


G = GROUP F BY (movie_id, country, year);
H = foreach G generate FLATTEN(group) as (movie_id, country, year), $1.$3 AS (genre:{T:(value:chararray)});
I = foreach H generate movie_id, country, year, FLATTEN(BagToTuple(genre.value));
Dump I;

(2343433,France,2015,Sci-Fi,Drama,Family)
(2343433,Germany,2015,Sci-Fi,Drama,Family)
(2343433,Argentina,2015,Sci-Fi,Drama,Family)
(2343433,Netherlands,2015,Sci-Fi,Drama,Family)
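The answer above covers the first of the two requested layouts. For the second layout (one row per movie holding all countries and all genres), a minimal sketch might group on movie_id and year only and de-duplicate each column inside a nested FOREACH. The aliases G2, H2, c, g, uniq_countries and uniq_genres are hypothetical names introduced here, not part of the original answer, and the sketch assumes F has the fields movie_id, country, year and genre as shown in the Dump F output.

```pig
-- Sketch only: collapse F to one row per (movie_id, year).
G2 = GROUP F BY (movie_id, year);
H2 = FOREACH G2 {
    -- Project each column out of the grouped bag, then drop duplicates.
    c = F.country;
    uniq_countries = DISTINCT c;
    g = F.genre;
    uniq_genres = DISTINCT g;
    -- BagToTuple turns each bag into a single tuple; FLATTEN spreads
    -- that tuple's fields into top-level columns of the output row.
    GENERATE group.movie_id,
             FLATTEN(BagToTuple(uniq_countries)),
             group.year,
             FLATTEN(BagToTuple(uniq_genres));
};
DUMP H2;
```

Pig does not guarantee the order of tuples inside a bag, so the countries and genres may appear in a different order than in the example row from the question. The number of output columns also varies with the data, so downstream steps should not rely on fixed positions.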

This post on hadoop - Aggregating values in Pig Latin is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/32713436/
