
hadoop - How to join Pig bags


First, I have two data files.

largefile.txt:

1001    {(1,-1),(2,-1),(3,-1),(4,-1)}

smallfile.txt:
1002    {(1,0.04),(2,0.02),(4,0.03)}

I want smallfile.txt to end up like this:
1002    {(1,0.04),(2,0.02),(3,-1),(4,0.03)}

What type of join can I perform to get this?
A = LOAD './largefile.txt' USING PigStorage('\t') AS (id:int, a:bag{tuple(time:int,value:float)});

B = LOAD './smallfile.txt' USING PigStorage('\t') AS (id:int, b:bag{tuple(time:int,value:float)});

Best Answer

Could you clarify the requirement a bit? Do you want to join largefile.txt and smallfile.txt on the first column/field where it has the same value (e.g. 1002)? If so, you can simply do this:

A = LOAD './largefile.txt' USING PigStorage('\t') AS (id:int, a:bag{tuple(time:int, value:float)});

A = FOREACH A GENERATE id, FLATTEN(a) AS (time, value);

B = LOAD './smallfile.txt' USING PigStorage('\t') AS (id:int, b:bag{tuple(time:int, value:float)});

B = FOREACH B GENERATE id, FLATTEN(b) AS (time, value);

joined = JOIN A BY id, B BY id;   -- 'join' is a reserved keyword in Pig, so use a different alias
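
If the goal is instead the merged output shown in the question, i.e. filling in the (time, value) pairs that are missing from smallfile.txt with the -1 defaults from largefile.txt, a rough sketch along the following lines might work. This is not from the original answer: the relation names (DEFAULTS, VALS, ALLPAIRS, etc.) are illustrative, and it assumes largefile.txt holds the full list of times with their default values.

LARGE = LOAD './largefile.txt' USING PigStorage('\t') AS (id:int, a:bag{tuple(time:int, value:float)});
SMALL = LOAD './smallfile.txt' USING PigStorage('\t') AS (id:int, b:bag{tuple(time:int, value:float)});

DEFAULTS = FOREACH LARGE GENERATE FLATTEN(a) AS (time:int, def_value:float);   -- (1,-1.0), (2,-1.0), (3,-1.0), (4,-1.0)
VALS     = FOREACH SMALL GENERATE id, FLATTEN(b) AS (time:int, value:float);   -- (1002,1,0.04), ...

-- every (id, time) combination the output should contain
IDS_ALL  = FOREACH SMALL GENERATE id;
IDS      = DISTINCT IDS_ALL;
CROSSED  = CROSS IDS, DEFAULTS;
ALLPAIRS = FOREACH CROSSED GENERATE IDS::id AS id, DEFAULTS::time AS time, DEFAULTS::def_value AS def_value;

-- the left outer join keeps the times the small file does not have
J = JOIN ALLPAIRS BY (id, time) LEFT OUTER, VALS BY (id, time);
M = FOREACH J GENERATE ALLPAIRS::id AS id, ALLPAIRS::time AS time,
        (VALS::value IS NULL ? ALLPAIRS::def_value : VALS::value) AS value;

-- re-assemble one bag per id
G = GROUP M BY id;
R = FOREACH G GENERATE group AS id, M.(time, value) AS b;
DUMP R;   -- with the sample data: (1002,{(1,0.04),(2,0.02),(3,-1.0),(4,0.03)})

Crossing the distinct ids of smallfile.txt against the default list is what makes this work even when smallfile.txt contains several records rather than the single 1002 row in the example.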

Regarding hadoop - How to join Pig bags, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/38569062/
