
hadoop - MRUnit does not work with MultipleOutputs

Reposted. Author: 可可西里. Updated: 2023-11-01 16:31:23

When I run a basic MRUnit test that uses MultipleOutputs, I get the following exception:

java.lang.NullPointerException
at org.apache.hadoop.fs.Path.<init>(Path.java:105)
at org.apache.hadoop.fs.Path.<init>(Path.java:94)
at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.getDefaultWorkFile(FileOutputFormat.java:264)
at org.apache.hadoop.mapreduce.lib.output.TextOutputFormat.getRecordWriter(TextOutputFormat.java:125)
at org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.getRecordWriter(MultipleOutputs.java:405)
at org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.write(MultipleOutputs.java:387)
at com.skobbler.scratch.MOutputReduce.reduce(MOutputReduce.java:45)
at com.skobbler.scratch.MOutputReduce.reduce(MOutputReduce.java:28)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:164)
at org.apache.hadoop.mrunit.mapreduce.ReduceDriver.run(ReduceDriver.java:265)
at org.apache.hadoop.mrunit.mapreduce.ReducePhaseRunner.runReduce(ReducePhaseRunner.java:85)
at org.apache.hadoop.mrunit.mapreduce.MapReduceDriver.run(MapReduceDriver.java:249)
at org.apache.hadoop.mrunit.TestDriver.runTest(TestDriver.java:640)
at org.apache.hadoop.mrunit.TestDriver.runTest(TestDriver.java:627)

I found that the mapred.output.dir configuration property is requested and is null. The problem does not occur with plain (single) output.

MRUnit code:

    @Test
    public void testMultiOutput() throws IOException {
        MapReduceDriver<LongWritable, Text, Text, Text, Text, Text> mapReduceDriver = createMapReduceDrive();
        mapReduceDriver.withInput(new LongWritable(0L), new Text("a,b"));
        mapReduceDriver.withInput(new LongWritable(0L), new Text("a,c"));
        // Expected record on the named output "foo"
        mapReduceDriver.withMultiOutput("foo", new Text("a"), new Text("2"));
        mapReduceDriver.runTest();
    }

    // Builds a driver that wires together the mapper and reducer under test
    private MapReduceDriver<LongWritable, Text, Text, Text, Text, Text> createMapReduceDrive() {
        MOutputMap mapper = new MOutputMap();
        MOutputReduce reducer = new MOutputReduce();
        return MapReduceDriver.newMapReduceDriver(mapper, reducer);
    }

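How can I run the test without specifying a Hadoop file system or output path?

Hadoop 2, MRUnit 1.1.0

Best answer

Yes, I ran into this problem too, and I found a solution by reading the source code.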

TestDriver.java

You can call the driver's getConfiguration() method to obtain the job's Configuration object, and then set the output directory on it before running the test.

    Configuration conf = mapReduceDriver.getConfiguration();
    conf.set("mapreduce.output.fileoutputformat.outputdir", "aa");
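
To illustrate why setting this key makes the exception go away, here is a minimal stand-alone sketch. It is a simplified model, not Hadoop's actual code: the class `OutputDirSketch`, its methods, and the file name `foo-r-00000` are hypothetical and exist only for illustration. Only the configuration key and the value "aa" come from the answer above. The model looks up the output directory in a plain map and resolves a file name against it, so an unset key causes a NullPointerException, much like the one in the question's stack trace.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified, hypothetical model; none of these names are Hadoop classes.
class OutputDirSketch {
    // Configuration key taken from the answer above.
    static final String OUTPUT_DIR_KEY = "mapreduce.output.fileoutputformat.outputdir";

    // Resolves a file name against the configured output directory.
    // Calling a method on the looked-up value throws NullPointerException
    // when the key was never set.
    static String workFile(Map<String, String> conf, String fileName) {
        String outputDir = conf.get(OUTPUT_DIR_KEY);
        String separator = outputDir.endsWith("/") ? "" : "/";
        return outputDir + separator + fileName;
    }

    // Returns true when resolving a work file fails because the directory is unset.
    static boolean failsWithoutOutputDir(Map<String, String> conf) {
        try {
            workFile(conf, "foo-r-00000");
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(failsWithoutOutputDir(conf)); // prints "true": key unset
        conf.put(OUTPUT_DIR_KEY, "aa");                   // the fix from the answer
        System.out.println(workFile(conf, "foo-r-00000")); // prints "aa/foo-r-00000"
    }
}
```

In the real test the same idea applies: the value only needs to be non-null so that a path can be built from it.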

Regarding hadoop - MRUnit does not work with MultipleOutputs, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/29961266/
