
java - HADOOP - word count example for the 1.2.1 stable release


I'm working through the word count example for Hadoop 1.2.1, but something must have changed, because I can't seem to get it to work.

Here is my Reduce class:

public static class Reduce extends Reducer<WritableComparable, Writable, WritableComparable, Writable> {

    public void reduce(WritableComparable key,
                       Iterator<Writable> values,
                       OutputCollector<WritableComparable, NullWritable> output,
                       Reporter reporter) throws IOException {

        output.collect(key, NullWritable.get());
    }
}

And my main method:

public static void main(String[] args) throws Exception {

    JobConf jobConf = new JobConf(MapDemo.class);

    jobConf.setNumMapTasks(10);
    jobConf.setNumReduceTasks(1);

    jobConf.setJobName("MapDemo");

    jobConf.setOutputKeyClass(Text.class);
    jobConf.setOutputValueClass(NullWritable.class);

    jobConf.setMapperClass(Map.class);
    jobConf.setReducerClass(Reduce.class);

    jobConf.setInputFormat(TextInputFormat.class);
    jobConf.setOutputFormat(TextOutputFormat.class);

    FileInputFormat.setInputPaths(jobConf, new Path(args[0]));
    FileOutputFormat.setOutputPath(jobConf, new Path(args[1]));

    JobClient.runJob(jobConf);
}

My IDE tells me there is an error, and Maven confirms it:

[ERROR] COMPILATION ERROR :
[INFO] -------------------------------------------------------------
[ERROR] com/example/mapreduce/MapDemo.java:[71,16] method setReducerClass in class org.apache.hadoop.mapred.JobConf cannot be applied to given types;
required: java.lang.Class<? extends org.apache.hadoop.mapred.Reducer>
found: java.lang.Class<com.example.mapreduce.MapDemo.Reduce>
reason: actual argument java.lang.Class<com.example.mapreduce.MapDemo.Reduce> cannot be converted to java.lang.Class<? extends org.apache.hadoop.mapred.Reducer> by method invocation conversion
[INFO] 1 error
[INFO] -------------------------------------------------------------
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1.679s
[INFO] Finished at: Mon Sep 16 09:23:08 PDT 2013
[INFO] Final Memory: 17M/202M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.0:compile (default-compile) on project inventory: Compilation failure
[ERROR] com/example/mapreduce/MapDemo.java:[71,16] method setReducerClass in class org.apache.hadoop.mapred.JobConf cannot be applied to given types;
[ERROR] required: java.lang.Class<? extends org.apache.hadoop.mapred.Reducer>
[ERROR] found: java.lang.Class<com.example.mapreduce.MapDemo.Reduce>

I think the word count examples available online are out of date for 1.2.1. How do I fix this? Does anyone have a link to working 1.2.1 word count Java source?

Best Answer

Which link are you following? I have never seen this version of word count. But whatever you are following is definitely out of date, since it uses the old API, and I doubt you followed it correctly anyway.
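For context, the compile error is the tell: your Reduce extends the new-API org.apache.hadoop.mapreduce.Reducer class, while JobConf.setReducerClass only accepts the old-API org.apache.hadoop.mapred.Reducer interface. If you really wanted to stay on the old JobConf-based API, the reducer would have to look roughly like this sketch (the class name OldApiReduce is just illustrative):

import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

// Old-API reducer sketch: implements the mapred.Reducer interface and
// extends MapReduceBase to inherit the no-op configure()/close() methods.
// A class of this shape is what JobConf.setReducerClass can accept.
public static class OldApiReduce extends MapReduceBase
        implements Reducer<Text, IntWritable, Text, IntWritable> {

    public void reduce(Text key, Iterator<IntWritable> values,
            OutputCollector<Text, IntWritable> output, Reporter reporter)
            throws IOException {
        int sum = 0;
        while (values.hasNext()) {
            sum += values.next().get();
        }
        output.collect(key, new IntWritable(sum));
    }
}

That said, there is little reason to go that route; the new API below is the better path.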

This should work:

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    /**
     * The map class of WordCount.
     */
    public static class TokenCounterMapper extends
            Mapper<Object, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    /**
     * The reducer class of WordCount.
     */
    public static class TokenCounterReducer extends
            Reducer<Text, IntWritable, Text, IntWritable> {

        public void reduce(Text key, Iterable<IntWritable> values,
                Context context) throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable value : values) {
                sum += value.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    /**
     * The main entry point.
     */
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.addResource(new Path("/Users/miqbal1/hadoop-eco/hadoop-1.1.2/conf/core-site.xml"));
        conf.addResource(new Path("/Users/miqbal1/hadoop-eco/hadoop-1.1.2/conf/hdfs-site.xml"));
        conf.set("fs.default.name", "hdfs://localhost:9000");
        conf.set("mapred.job.tracker", "localhost:9001");
        Job job = new Job(conf, "WordCount");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenCounterMapper.class);
        job.setReducerClass(TokenCounterReducer.class);
        job.setNumReduceTasks(2);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path("/inputs/demo.txt"));
        FileOutputFormat.setOutputPath(job, new Path("/outputs/1111223"));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

A few observations:

  • You are not emitting any counts: from what I can see, your Reducer emits NullWritable, so it will only write out the keys without counting anything (see the corrected sketch after this list).
  • Use the correct types for your input and output keys/values.
  • Use the new API. It is cleaner and better.
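To tie the observations back to your original code: rewritten against the new API and emitting real counts, your Reduce would look roughly like this (a sketch reusing your class name, with concrete Text/IntWritable types in place of the raw WritableComparable/Writable generics):

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// The original Reduce, corrected: new-API signature (Iterable + Context)
// and an actual per-key sum emitted instead of NullWritable.
public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable value : values) {
            sum += value.get();
        }
        context.write(key, new IntWritable(sum));
    }
}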

Regarding "java - HADOOP - word count example for the 1.2.1 stable release", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/18832949/
