
hadoop - How to supply a subclass in Hadoop's Mapper and Reducer?


I have a child class that extends a super (parent) class. I want a way to make the Mapper's input value type generic, so that both the child and the parent are accepted as valid values, like this:

public static class MyMapper extends Mapper<..., MyParentClass, ..., ...>

I would like MyChildClass, which extends MyParentClass, to be valid as well.

However, when I run the program, I get an exception whenever the value is the child class:

Type mismatch in value from map: expected MyParentClass, received MyChildClass

How can I make both the child and the parent class valid as the Mapper's input/output values?

Update:

package hipi.examples.dumphib;

import hipi.image.FloatImage;
import hipi.image.ImageHeader;
import hipi.imagebundle.mapreduce.ImageBundleInputFormat;
import hipi.util.ByteUtils;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

import java.io.IOException;
import java.util.Iterator;

public class DumpHib extends Configured implements Tool {

    public static class DumpHibMapper extends Mapper<ImageHeader, FloatImage, IntWritable, Text> {

        @Override
        public void map(ImageHeader key, FloatImage value, Context context) throws IOException, InterruptedException {

            String outputStr = null;

            if (key == null) {
                outputStr = "Failed to read image header.";
            } else if (value == null) {
                outputStr = "Failed to decode image data.";
            } else {
                // Only touch the image once we know it decoded successfully.
                int imageWidth = value.getWidth();
                int imageHeight = value.getHeight();
                String camera = key.getEXIFInformation("Model");
                String hexHash = ByteUtils.asHex(ByteUtils.FloatArraytoByteArray(value.getData()));
                outputStr = imageWidth + "x" + imageHeight + "\t(" + hexHash + ")\t " + camera;
            }

            context.write(new IntWritable(1), new Text(outputStr));
        }
    }

    public static class DumpHibReducer extends Reducer<IntWritable, Text, IntWritable, Text> {

        @Override
        public void reduce(IntWritable key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
            for (Text value : values) {
                context.write(key, value);
            }
        }
    }

    public int run(String[] args) throws Exception {

        if (args.length < 2) {
            System.out.println("Usage: dumphib <input HIB> <output directory>");
            System.exit(0);
        }

        Configuration conf = new Configuration();

        Job job = Job.getInstance(conf, "dumphib");

        job.setJarByClass(DumpHib.class);
        job.setMapperClass(DumpHibMapper.class);
        job.setReducerClass(DumpHibReducer.class);

        job.setInputFormatClass(ImageBundleInputFormat.class);
        job.setOutputKeyClass(IntWritable.class);
        job.setOutputValueClass(Text.class);

        String inputPath = args[0];
        String outputPath = args[1];

        removeDir(outputPath, conf);

        FileInputFormat.setInputPaths(job, new Path(inputPath));
        FileOutputFormat.setOutputPath(job, new Path(outputPath));

        job.setNumReduceTasks(1);

        return job.waitForCompletion(true) ? 0 : 1;
    }

    private static void removeDir(String path, Configuration conf) throws IOException {
        Path output_path = new Path(path);
        FileSystem fs = FileSystem.get(conf);
        if (fs.exists(output_path)) {
            fs.delete(output_path, true);
        }
    }

    public static void main(String[] args) throws Exception {
        int res = ToolRunner.run(new DumpHib(), args);
        System.exit(res);
    }

}

FloatImage is the super class, and I have a ChildFloatImage class that extends it. When a ChildFloatImage is returned from the record reader, it throws the exception above.
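For reference, the class relationship involved is roughly the following; the body of ChildFloatImage here is an illustrative assumption (the real class may carry different fields), and it assumes the same hipi.image.FloatImage import as above:

// Illustrative sketch only; the real ChildFloatImage may differ.
public class ChildFloatImage extends FloatImage {

    // Hypothetical extra field carried by the subclass.
    private int label;

    public ChildFloatImage() {
        super();
    }

    // A real subclass would also override write()/readFields()
    // so that its extra state survives serialization.
}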

Best Answer

The solution I followed is to create a container/wrapper class that delegates all the required functionality to the original object, as shown below. (As far as I can tell, Hadoop compares the exact runtime class of each emitted value against the configured value class rather than accepting subclasses, which is why wrapping side-steps the problem.)

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.BinaryComparable;
import org.apache.hadoop.io.RawComparator;
import org.apache.hadoop.io.Writable;

import hipi.image.FloatImage;

public class FloatImageContainer implements Writable, RawComparator<BinaryComparable> {

    private FloatImage floatImage;

    public FloatImage getFloatImage() {
        return floatImage;
    }

    public void setFloatImage(FloatImage floatImage) {
        this.floatImage = floatImage;
    }

    public FloatImageContainer() {
        this.floatImage = new FloatImage();
    }

    public FloatImageContainer(FloatImage floatImage) {
        this.floatImage = floatImage;
    }

    @Override
    public int compare(BinaryComparable o1, BinaryComparable o2) {
        // Delegate object comparison to the wrapped FloatImage.
        return floatImage.compare(o1, o2);
    }

    @Override
    public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        // Delegate raw byte comparison to the wrapped FloatImage.
        return floatImage.compare(b1, s1, l1, b2, s2, l2);
    }

    @Override
    public void write(DataOutput out) throws IOException {
        // Serialize by writing the wrapped FloatImage.
        floatImage.write(out);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        // Deserialize into the wrapped FloatImage.
        floatImage.readFields(in);
    }

}

And in the mapper:
public static class MyMapper extends Mapper<..., FloatImageContainer, ..., ...> {

In this case, both FloatImage and ChildFloatImage can be wrapped in a FloatImageContainer, and you get rid of the inheritance problem in Hadoop, because only one class, FloatImageContainer, is used directly, and it is not the parent or child of anything.
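For example, a mapper declared against the wrapper could look like the sketch below. It assumes the record reader (or a small InputFormat subclass) has been changed to emit FloatImageContainer values, and it reuses the imports from the DumpHib example above; the width/height output is just a placeholder:

public static class MyMapper
        extends Mapper<ImageHeader, FloatImageContainer, IntWritable, Text> {

    @Override
    public void map(ImageHeader key, FloatImageContainer container, Context context)
            throws IOException, InterruptedException {
        // The wrapped image may be a plain FloatImage or any subclass such as ChildFloatImage.
        FloatImage image = container.getFloatImage();
        context.write(new IntWritable(1),
                new Text(image.getWidth() + "x" + image.getHeight()));
    }
}

On the output side, the same wrapper can be set as the map output value class, so the framework only ever sees FloatImageContainer regardless of which FloatImage subtype is inside.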

Regarding "hadoop - How to supply a subclass in Hadoop's Mapper and Reducer?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/42165885/
