
hadoop - How to pass a variable between two map reduce jobs

Reposted · Author: 可可西里 · Updated: 2023-11-01 14:26:14

I have chained two MapReduce jobs. Job1 has only one reducer, in which I compute a float value. I want to use this value in the reducer of Job2. Here is my main method setup.

public static String GlobalVriable;

public static void main(String[] args) throws Exception {

    int runs = 0;
    for (; runs < 10; runs++) {
        String inputPath = "part-r-000" + nf.format(runs);
        String outputPath = "part-r-000" + nf.format(runs + 1);
        MyProgram.MR1(inputPath);
        MyProgram.MR2(inputPath, outputPath);
    }
}

public static void MR1(String inputPath)
        throws IOException, InterruptedException, ClassNotFoundException {

    Configuration conf = new Configuration();
    conf.set("var1", "");
    Job job = new Job(conf, "This is job1");
    job.setJarByClass(MyProgram.class);
    job.setMapperClass(MyMapper1.class);
    job.setReducerClass(MyReduce1.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(FloatWritable.class);
    FileInputFormat.addInputPath(job, new Path(inputPath));
    job.waitForCompletion(true);
    GlobalVriable = conf.get("var1"); // I am getting NULL here
}

public static void MR2(String inputPath, String outputPath)
        throws IOException, InterruptedException, ClassNotFoundException {

    Configuration conf = new Configuration();
    Job job = new Job(conf, "This is job2");
    ...
}

public static class MyReduce1 extends
        Reducer<Text, FloatWritable, Text, FloatWritable> {

    public void reduce(Text key, Iterable<FloatWritable> values, Context context)
            throws IOException, InterruptedException {

        float s = 0;
        for (FloatWritable val : values) {
            s += val.get();
        }

        String sum = Float.toString(s);
        context.getConfiguration().set("var1", sum);
    }
}

As you can see, I need to iterate the whole program multiple times. Job1 computes a single number from its input. Since it is just one number across many iterations, I don't want to write it to HDFS and read it back. Is there a way to share the value computed in MyReduce1 and use it in MyReduce2?

Update: I tried passing the value using conf.set and conf.get. The value is not being passed.

Best Answer

Here is how to pass a float value back via a counter.

First, in the first reducer, convert the float to a long by multiplying it by 1000 (to keep 3 decimal places of precision, for example) and put the result into a counter:

public void cleanup(Context context) {

    // Counters only carry integral values, so scale the float to a long,
    // preserving 3 decimal places of precision.
    long result = (long) (floatValue * 1000);
    context.getCounter("Result", "Result").increment(result);

}
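The scaling round trip can be sanity-checked outside Hadoop. A minimal sketch in plain Java (the class and method names here are made up for illustration):

```java
public class CounterScaling {

    // Encode a float as a long with 3 decimal places of precision,
    // since a counter can only carry integral values.
    static long toCounter(float f) {
        return (long) (f * 1000);
    }

    // Decode the counter value back into a float on the driver side.
    static float fromCounter(long c) {
        return ((float) c) / 1000;
    }

    public static void main(String[] args) {
        float original = 3.14159f;
        long counter = toCounter(original);    // truncates to 3141
        float restored = fromCounter(counter); // precision beyond 3 decimals is lost
        System.out.println(counter + " " + restored);
    }
}
```

Note the truncation: anything beyond the third decimal place does not survive the round trip, so pick the scale factor to match the precision you need.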

In the driver class, retrieve the long value and convert it back to a float:

public static void MR1(String inputPath)
        throws IOException, InterruptedException, ClassNotFoundException {

    Configuration conf = new Configuration();
    Job job = new Job(conf, "This is job1");
    job.setJarByClass(MyProgram.class);
    job.setMapperClass(MyMapper1.class);
    job.setReducerClass(MyReduce1.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(FloatWritable.class);
    FileInputFormat.addInputPath(job, new Path(inputPath));
    job.waitForCompletion(true);

    // Counters are aggregated by the framework and, unlike Configuration
    // values written in a reducer, are readable from the driver after the job.
    long result = job.getCounters().findCounter("Result", "Result").getValue();
    float value = ((float) result) / 1000;

}
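To get the recovered value into Job2, a common pattern (not spelled out in the answer above) is to set it on Job2's Configuration before submission and read it once in the second reducer's setup(). A hedged sketch; the key name "job1.result" and the MyReduce2 types are assumptions for illustration:

```java
// In the driver, after retrieving `value` from the counter:
Configuration conf2 = new Configuration();
conf2.setFloat("job1.result", value); // "job1.result" is an arbitrary key
Job job2 = new Job(conf2, "This is job2");
// ... configure mapper, reducer, and paths as usual ...

// In the second reducer, read the value once per task:
public static class MyReduce2 extends
        Reducer<Text, FloatWritable, Text, FloatWritable> {

    private float job1Result;

    @Override
    protected void setup(Context context) {
        // The second argument is the default, returned only if the key was never set.
        job1Result = context.getConfiguration().getFloat("job1.result", 0f);
    }
}
```

This works because the Configuration is frozen and shipped to the tasks at job submission; that is also why the asker's original approach fails, since a conf.set done inside a reducer only changes the task-local copy and never flows back to the driver.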

Regarding "hadoop - How to pass a variable between two map reduce jobs", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/13654965/
