
hadoop - Pig Latin script - using GROUP and TOBAG

Reposted. Author: 可可西里. Updated: 2023-11-01 15:59:12

I have a file with the following contents.

Input:

TOYID;TOYSeries;ModuleID;ID;PART_NUMBER;SUPPLIER;LAND
394107;C204; 731305; 69807402;A0001532122;ABC;AT
394107;C204; 731307; 69807402;A0001532122;ABC;AT
394107;C204; 731315; 69807402;A0001532122;ABC;AT
394107;C204; 731325; 69807402;A0001532122;ABC;AT
394107;C204; 731335; 69807402;A0001532122;ABC;AT
394107;C204; 731345; 69807402;A0001532122;ABC;AT

I want output like this:

SUPPLIER;LAND;COUNT(SUPPLIER,LAND);TOYID;TOYSeries;ModuleID;ID;PART_NUMBER
ABC;AT;6;394107;C204;731305;69807402;A0001532122
ABC;AT;6;394107;C204;731307;69807402;A0001532122

I tried:

A = LOAD 'hdfs://localhost:8020/BigData_POC/....../TOY_Detail.txt' USING PigStorage(';') AS (TOYID:chararray,TOYSeries:chararray,ModuleID:chararray,ID:chararray,DESCRIPTION:chararray,PART_NUMBER:chararray,SUPPLIER:chararray,LAND:chararray);
B = FOREACH A GENERATE TOYID,ModuleID,DESCRIPTION,PART_NUMBER,SUPPLIER,LAND;
C = GROUP B by (SUPPLIER,LAND);
D = foreach C generate group, COUNT(B) as cnt, B.TOYID,B.ModuleID,B.PART_NUMBER;

I get output like this:

(SUPPLIER,LAND) COUNT {(TOYID) (TOYID) (TOYID)...(TOYID) (MODULEID) (MODULEID) (MODULEID)... (MODULEID)(PARTNUMBER) (PARTNUMBER)... (PARTNUMBER)}

Do you know of a Pig Latin script that would produce the output I want?

Best answer

Based on your comment, could you try this as a solution? I have not verified it myself, so it may still need some adjustment.

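-- Count rows per (SUPPLIER, LAND); C is the grouped relation from the question.
D = FOREACH C GENERATE group, COUNT(B) AS cnt;
-- Unpack the group tuple. Field names are case-sensitive, so use the schema's upper-case names.
E = FOREACH D GENERATE group.SUPPLIER AS supplier, group.LAND AS land, cnt;
-- Join the counts back onto the detail rows so that each row carries its group count.
F = JOIN B BY (SUPPLIER, LAND), E BY (supplier, land);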
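
The joined relation F still carries the key columns from both sides, so a final projection is needed to get the column order asked for in the question. The following is an unverified sketch of that step, plus an alternative that avoids the join by flattening the grouped bag. The aliases G and H and the output path are hypothetical names chosen for illustration. Note that B as defined in the question no longer contains TOYSeries or ID, so those columns would have to be added to the projection in B before they can appear in the result.

```pig
-- Assumes relations B, C and F as defined in the snippets above.

-- Option 1: project the join result into the desired column order.
-- The :: operator disambiguates fields that exist on both sides of the join.
G = FOREACH F GENERATE
        B::SUPPLIER    AS SUPPLIER,
        B::LAND        AS LAND,
        E::cnt         AS cnt,
        B::TOYID       AS TOYID,
        B::ModuleID    AS ModuleID,
        B::PART_NUMBER AS PART_NUMBER;

-- Option 2: skip the join. FLATTEN on the bag emits one output row per
-- input row of the group, each repeating the group key and the count.
H = FOREACH C GENERATE
        FLATTEN(group) AS (SUPPLIER, LAND),
        COUNT(B)       AS cnt,
        FLATTEN(B.(TOYID, ModuleID, PART_NUMBER));

-- Write semicolon-delimited output; 'output/toy_counts' is a placeholder path.
STORE G INTO 'output/toy_counts' USING PigStorage(';');
```

Option 2 is usually the shorter route, because the bag B inside each group already holds the detail rows. The original attempt printed them as nested bags only because B.TOYID, B.ModuleID and B.PART_NUMBER were projected without FLATTEN.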

Regarding "hadoop - Pig Latin script - using GROUP and TOBAG", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/40017231/
