hadoop - Maximum value of a column in Apache Pig

Repost. Author: 可可西里. Updated: 2023-11-01 16:36:29
I am trying to find the maximum value of the column ratingTime using Pig. I am running the following script:

    ratings = LOAD '/user/maria_dev/ml-100k/u.data' AS (userid:int,movieID:int,rating:int, ratingTime:int);
    maxrating = MAX(ratings.ratingTime);
    DUMP maxrating

Sample input data:

    196 242 3 881250949
    186 302 3 891717742
    22 377 1 878887116
    244 51 2 880606923

I get the following error:

    2018-08-05 07:02:05,247 [main] INFO org.apache.pig.backend.hadoop.PigATSClient - Created ATS Hook
    2018-08-05 07:02:05,914 [main] ERROR org.apache.pig.PigServer - exception during parsing: Error during parsing. <file script.pi

Best answer

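You need a preceding GROUP ALL before applying MAX. MAX is an aggregate function that operates on a bag, so it has to be called inside FOREACH ... GENERATE on a grouped relation rather than assigned directly from the ungrouped relation.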

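    -- Load the tab-separated ratings file with an explicit schema.
    ratings = LOAD '/user/maria_dev/ml-100k/u.data' USING PigStorage('\t') AS (userid:int,movieID:int,rating:int, ratingTime:int);
    -- Collect every row into a single group so MAX has a bag to aggregate.
    ratings_group = GROUP ratings ALL;
    maxrating = FOREACH ratings_group GENERATE MAX(ratings.ratingTime);
    DUMP maxrating;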
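
To see what the GROUP ALL step does, it can help to inspect the grouped relation before aggregating. The following is a minimal sketch, assuming the same tab-separated file and schema as in the question. The alias ratings_group and the output field name maxRatingTime are illustrative choices, not part of the original post.

```pig
-- Load the tab-separated ratings (same path and schema as in the question).
ratings = LOAD '/user/maria_dev/ml-100k/u.data' USING PigStorage('\t')
          AS (userid:int, movieID:int, rating:int, ratingTime:int);

-- GROUP ... ALL collapses the whole relation into a single tuple:
-- a group key plus one bag named "ratings" holding every input row.
ratings_group = GROUP ratings ALL;

-- Print the schema of the grouped relation to confirm the bag is present.
DESCRIBE ratings_group;

-- MAX aggregates over a bag, so it is applied inside FOREACH ... GENERATE.
-- The AS clause names the output field (maxRatingTime is an assumed name).
maxrating = FOREACH ratings_group GENERATE MAX(ratings.ratingTime) AS maxRatingTime;
DUMP maxrating;
```

Of the four sample rows shown in the question, the largest ratingTime is 891717742, so on that input DUMP should print a single tuple containing that value.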

Regarding "hadoop - Maximum value of a column in Apache Pig", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/51692298/
