
java - Hadoop: Reduce is not producing the desired output; it is the same as the map output


Here is my map:

public static class MapClass extends Mapper<LongWritable, Text, Text, Text> {

    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        // Split the CSV record; the negative limit keeps trailing empty fields
        String[] fields = value.toString().split(",", -20);
        String country = fields[4];
        String numClaims = fields[8];
        // Skip records whose claims field is empty or quoted (the header row)
        if (numClaims.length() > 0 && !numClaims.startsWith("\"")) {
            context.write(new Text(country), new Text(numClaims + ",1"));
        }
    }
}

Here is my Reduce:

public void reduce(Text key, Iterator<Text> values, Context context) throws IOException, InterruptedException {
    double sum = 0.0;
    int count = 0;

    // Each value is "numClaims,count"; accumulate both parts
    while (values.hasNext()) {
        String[] fields = values.next().toString().split(",");
        sum += Double.parseDouble(fields[0]);
        count += Integer.parseInt(fields[1]);
    }

    context.write(new Text(key), new DoubleWritable(sum / count));
}

Here is how the job is configured:

Job job = new Job(getConf());

job.setJarByClass(AverageByAttributeUsingCombiner.class);
job.setJobName("AverageByAttributeUsingCombiner");

job.setOutputKeyClass(Text.class);
job.setOutputValueClass(Text.class);

job.setMapperClass(MapClass.class);
// job.setCombinerClass(Combinber.class);
job.setReducerClass(Reduce.class);

job.setInputFormatClass(TextInputFormat.class);
job.setOutputFormatClass(TextOutputFormat.class);

FileInputFormat.setInputPaths(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));

// job.setNumReduceTasks(0); // to not run the reducer
boolean success = job.waitForCompletion(true);
return success ? 0 : 1;

The input is of the form:

   "PATENT","GYEAR","GDATE","APPYEAR","COUNTRY","POSTATE","ASSIGNEE","ASSCODE","CLAIMS","NCLASS","CAT","SUBCAT","CMADE","CRECEIVE","RATIOCIT","GENERAL","ORIGINAL","FWDAPLAG","BCKGTLAG","SELFCTUB","SELFCTLB","SECDUPBD│                                                                                                                                                                                                                
","SECDLWBD" │
3070801,1963,1096,,"BE","",,1,,269,6,69,,1,,0,,,,,,, │
3070802,1963,1096,,"US","TX",,1,,2,6,63,,0,,,,,,,,, │
3070803,1963,1096,,"US","IL",,1,,2,6,63,,9,,0.3704,,,,,,, │
3070804,1963,1096,,"US","OH",,1,,2,6,63,,3,,0.6667,,,,,,,

The output of the whole map reduce job looks like:

"AR"	5,1
"AR"	9,1
"AR"	2,1
"AR"	15,1
"AR"	13,1
"AR"	1,1
"AR"	34,1
"AR"	12,1
"AR"	8,1
"AR"	7,1
"AR"	23,1
"AR"	3,1
"AR"	4,1
"AR"	4,1

How can I debug and fix this? I am learning Hadoop.

Best Answer

The problem is that you are not actually overriding the reduce() method of the abstract Reducer base class, so the inherited default implementation runs instead.
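
That default implementation is an identity pass-through, which is exactly why your output is identical to the map output. In the new API (org.apache.hadoop.mapreduce.Reducer) it looks roughly like this:

    // Sketch of the default reduce() in org.apache.hadoop.mapreduce.Reducer:
    // every (key, value) pair is written through unchanged.
    protected void reduce(KEYIN key, Iterable<VALUEIN> values, Context context)
            throws IOException, InterruptedException {
        for (VALUEIN value : values) {
            context.write((KEYOUT) key, (VALUEOUT) value);
        }
    }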

More specifically, the immediate problem is that your reduce method signature is:

public void reduce(Text key, **Iterator**<Text> values, Context context)
        throws IOException, InterruptedException

whereas it should be:

public void reduce(Text key, **Iterable**<Text> values, Context context)
        throws IOException, InterruptedException

Your signature would have been correct under the old API, where you implement the reduce() method of the Reducer interface, which really does take an Iterator, and it works.
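
For comparison, the old-API interface (org.apache.hadoop.mapred.Reducer) declares reduce() with an Iterator, and with an OutputCollector and Reporter in place of a Context:

    // Old-API reduce() signature from org.apache.hadoop.mapred.Reducer:
    void reduce(K2 key, Iterator<V2> values,
                OutputCollector<K3, V3> output, Reporter reporter)
            throws IOException;

Because your Iterator-based method matches neither API's signature, Java treats it as an unrelated overload, never calls it, and falls back to the identity reduce shown above.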

A good safeguard against this kind of bug is to annotate the method with @Override, because it forces a compile-time check for signature mismatches.
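
Putting this together, a corrected reducer might look like the sketch below (the class declaration is assumed, since the question does not show it). With @Override in place, the earlier Iterator version would have failed to compile:

    public static class Reduce extends Reducer<Text, Text, Text, DoubleWritable> {

        @Override // fails to compile if the signature does not match the base class
        public void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            double sum = 0.0;
            int count = 0;

            // Iterable, not Iterator: iterate with a for-each loop
            for (Text value : values) {
                String[] fields = value.toString().split(",");
                sum += Double.parseDouble(fields[0]);
                count += Integer.parseInt(fields[1]);
            }

            context.write(key, new DoubleWritable(sum / count));
        }
    }

Note that once the reducer actually runs, the job configuration will also need to distinguish the map output types from the final output types, e.g. job.setMapOutputValueClass(Text.class) and job.setOutputValueClass(DoubleWritable.class), since the mapper emits Text values but the reducer emits DoubleWritable.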

Regarding java - Hadoop: Reduce is not producing the desired output; it is the same as the map output, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/11748906/
