
hadoop - Sum of an unnamed column in Pig

Reposted. Author: 行者123. Updated: 2023-12-02 21:17:58

 shipnode,delivery_method ,<unnamed>
(9935,PICK,2)
(9960,PICK,2)
(9969,PICK,1)
(9963,SHP,1)
(9989,SHP,1)
(9995,SHP,1)
(9965,SHP,1)
(9995,SHP,1)

This is the output. The grouping statement is:
 grunt> group_all_shipnode = GROUP
>> union_all
>> BY(
>> shipnode,delivery_method
>> )
>> ;

The last column is unnamed. Now I want to group by shipnode and delivery_method
and produce the sum of the third column for each group, like this:
 (9935,PICK,2)
(9960,PICK,2)
(9969,PICK,1)
(9963,SHP,1)
(9989,SHP,1)
(9995,SHP,2) <<------- sum of similar
(9965,SHP,1)

I am trying to do this with:
 grunt> sum_group_all_shipnode =FOREACH group_all_shipnode 
>> GENERATE FLATTEN(group) as(shipnode:chararray, delivery_method:chararray),
>> sum($1.$2);

which produces the error:
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1070: Could not resolve sum using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]

Best answer

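Instead of $1.$2, the argument must be a field of the relation you grouped, that is, the relation from your load statement (union_all in the question). Pig built-in function names are also case-sensitive, so the function must be written SUM; the lowercase sum is the name that ERROR 1070 fails to resolve.
For example, suppose you are loading the data into relation A: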

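-- No schema is declared, so fields are referenced by position ($0, $1, $2).
A = LOAD 'data.csv' USING PigStorage(',');
-- Group by the first two fields: shipnode ($0) and delivery_method ($1).
group_all_shipnode = GROUP A BY ($0, $1);
sum_group_all_shipnode = FOREACH group_all_shipnode
    GENERATE
        FLATTEN(group) AS (shipnode:chararray, delivery_method:chararray),
        SUM(A.$2);  -- sum the third (unnamed) field within each group's bag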
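
If the data already sits in an existing relation, as with union_all in the question, one option is to give the unnamed column a name first and then sum it by name. The following is only a sketch under assumptions: the relation names named, grouped and summed and the field names cnt and total are hypothetical, and the casts assume the first two fields are strings and the third is an integer count.

```pig
-- Assumption: union_all is the existing relation from the question,
-- with three positional fields ($0, $1, $2).
named = FOREACH union_all GENERATE
    (chararray)$0 AS shipnode,
    (chararray)$1 AS delivery_method,
    (long)$2      AS cnt;  -- "cnt" is a hypothetical name for the unnamed column

-- Group on the two key fields, then sum the named column in each group's bag.
grouped = GROUP named BY (shipnode, delivery_method);
summed = FOREACH grouped GENERATE
    FLATTEN(group) AS (shipnode, delivery_method),
    SUM(named.cnt) AS total;

DUMP summed;
```

Naming the column up front avoids positional references such as $2 later in the script, which tend to break silently when the field order changes.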

Regarding "hadoop - Sum of an unnamed column in Pig", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/38265870/
