apache-pig - Pig: how do I reference columns in a FOREACH after a JOIN?

Reposted. Author: 行者123. Updated: 2023-12-03 12:04:42

A = load 'a.txt' as (id, a1);
B = load 'b.txt' as (id, b1);
C = join A by id, B by id;
D = foreach C generate id,a1,b1;
dump D;

The fourth line fails with: Invalid field projection. Projected field [id] does not exist in schema
I tried changing it to A.id, but then the last line fails with: ERROR 0: Scalar has more than one row in the output.

Best answer

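What you are looking for is the "disambiguate operator". You want A::id, not A.id.
A.id says: "there is a relation/bag A, and its schema has a column called id". Used inside a FOREACH over C, Pig treats it as a scalar projection of the whole relation A, which is why it fails as soon as A has more than one row.
A::id says: "there is a record that came from A, and one of its columns is called id".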

So you would do the following:

A = load 'a.txt' as (id, a1);
B = load 'b.txt' as (id, b1);
C = join A by id, B by id;
D = foreach C generate A::id,a1,b1;
dump D;
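
To see why the prefix is needed, it can help to inspect the schema of the joined relation with describe. The sketch below reuses the relations from the answer. The schema shown in the comment assumes no types were declared in the load statements, so every field defaults to bytearray; the exact formatting may differ between Pig versions. The relation E is only there to illustrate positional references and is not part of the original answer.

```pig
A = load 'a.txt' as (id, a1);
B = load 'b.txt' as (id, b1);
C = join A by id, B by id;

-- Print the schema of the joined relation.
describe C;
-- Expect something like:
-- C: {A::id: bytearray,A::a1: bytearray,B::id: bytearray,B::b1: bytearray}

-- id exists on both sides of the join, so it has to be prefixed.
-- a1 and b1 are unique in C, so the bare names still resolve.
D = foreach C generate A::id, a1, b1;

-- Positional references avoid names altogether: $0 is A::id, $1 is A::a1, $3 is B::b1.
E = foreach C generate $0, $1, $3;

dump D;
```

Since both sides were joined on id, A::id and B::id hold the same value in every output row, so either one can be projected.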

A dirty alternative:

Simply because I am lazy, and because disambiguation gets weird once you start chaining several joins one after another: use unique identifiers.
A = load 'a.txt' as (ida, a1);
B = load 'b.txt' as (idb, b1);
C = join A by ida, B by idb;
D = foreach C generate ida,a1,b1;
dump D;
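
If renaming the columns at load time is not an option, another way to keep chained joins manageable is to rename the fields in a foreach right after each join, so the prefixes do not pile up. This is a sketch rather than part of the original answer: the third input x.txt and the aliases X, AB, AB2, ABX and R are hypothetical names chosen for illustration.

```pig
A = load 'a.txt' as (id, a1);
B = load 'b.txt' as (id, b1);
X = load 'x.txt' as (id, x1);   -- hypothetical third input

AB = join A by id, B by id;

-- Keep a single id column and give every field a plain, unprefixed name.
AB2 = foreach AB generate A::id as id, a1 as a1, b1 as b1;

-- The second join only has to disambiguate between AB2 and X.
ABX = join AB2 by id, X by id;
R = foreach ABX generate AB2::id as id, a1, b1, x1;

dump R;
```

Without the intermediate foreach, the fields of the second join would carry nested prefixes, which is the weirdness the answer alludes to.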

This post on apache-pig (how to reference columns in a FOREACH after a JOIN) is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/8051180/
