
hadoop - Different ways of launching a MapReduce job

Reposted. Author: 可可西里. Updated: 2023-11-01 14:56:25

In Apache Hadoop, what is the difference between launching a MapReduce job with just the job.waitForCompletion(true) method and launching it through ToolRunner.run(new MyClass(), args)?

I have a MapReduce job that I run in the following two ways.

The first is:

import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MaxTemperature extends Configured implements Tool {

    public static void main(String[] args) throws Exception {
        int exitCode = ToolRunner.run(new MaxTemperature(), args);
        System.exit(exitCode);
    }

    @Override
    public int run(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: MaxTemperature <input path> <output path>");
            return -1; // return from run() rather than calling System.exit()
        }
        System.out.println("Starting job");
        Job job = new Job(); // deprecated; Job.getInstance(getConf()) is preferred
        job.setJarByClass(MaxTemperature.class);
        job.setJobName("Max temperature");

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        job.setMapperClass(MaxTemperatureMapper.class);
        job.setReducerClass(MaxTemperatureReducer.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        int returnValue = job.waitForCompletion(true) ? 0 : 1;

        if (job.isSuccessful()) {
            System.out.println("Job was successful");
        } else {
            System.out.println("Job was not successful");
        }
        return returnValue;
    }
}

The second is:

public class MaxTemperature {

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: MaxTemperature <input path> <output path>");
            System.exit(-1);
        }
        System.out.println("Starting job");
        Job job = new Job();
        job.setJarByClass(MaxTemperature.class);
        job.setJobName("Max temperature");

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        job.setMapperClass(MaxTemperatureMapper.class);
        job.setReducerClass(MaxTemperatureReducer.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        int returnValue = job.waitForCompletion(true) ? 0 : 1;

        if (job.isSuccessful()) {
            System.out.println("Job was successful");
        } else {
            System.out.println("Job was not successful");
        }
        System.exit(returnValue); // use the job's outcome as the process exit code
    }
}

Both approaches produce the same output, but I don't understand what the difference between them is. Which one is preferred over the other?

Best answer

This post explains the use of ToolRunner well: ToolRunner

In short: ToolRunner runs your Tool through GenericOptionsParser, which strips the standard Hadoop command-line options (such as -D property=value, -files, -libjars and -archives) out of the argument list, applies them to a Configuration, hands that Configuration to your class via setConf(), and then calls run() with only the remaining application arguments. With a bare main() and job.waitForCompletion(true) you get none of that parsing; any configuration overrides have to be hard-coded.

Note that both of your versions call new Job(), which builds a fresh Configuration instead of using getConf(); that is why they behave identically in your tests. To actually benefit from ToolRunner, create the job with Job.getInstance(getConf()) inside run().
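To make the difference concrete, here is a small, self-contained sketch of the -D handling that GenericOptionsParser performs before your run() method is called. The class and method names are hypothetical, and it uses only the plain JDK (no Hadoop dependency); it is an illustration of the idea, not Hadoop's actual implementation:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified sketch of what GenericOptionsParser does for
// ToolRunner: pull "-D key=value" pairs out of the argument list, record
// them as configuration overrides, and pass only the leftover arguments on.
public class GenericOptionsSketch {

    // Collect -D overrides into conf; return the remaining arguments.
    static List<String> parse(String[] argv, Map<String, String> conf) {
        List<String> remaining = new ArrayList<>();
        for (int i = 0; i < argv.length; i++) {
            if ("-D".equals(argv[i]) && i + 1 < argv.length) {
                String[] kv = argv[++i].split("=", 2); // property=value
                conf.put(kv[0], kv.length > 1 ? kv[1] : "");
            } else {
                remaining.add(argv[i]);
            }
        }
        return remaining;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        List<String> rest = parse(
                new String[]{"-D", "mapreduce.job.reduces=2", "in", "out"}, conf);
        System.out.println(conf); // {mapreduce.job.reduces=2}
        System.out.println(rest); // [in, out]
    }
}
```

With the real ToolRunner, the overrides land on the Configuration your Tool sees through getConf(), and run() receives the equivalent of `rest` above, so `args[0]` and `args[1]` are still the input and output paths.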

Regarding "hadoop - Different ways of launching a MapReduce job", there is a similar question on Stack Overflow: https://stackoverflow.com/questions/41697229/
