
java - Map-Reduce program: Mapper not behaving as expected

Reposted. Author: 可可西里. Updated: 2023-11-01 16:52:24

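Friends,

I am new to Map-Reduce and am trying an example that runs only the Mapper, but the output is strange and not what I expected. Please help me find what I am missing here:

Code:

Imports: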

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

Driver program

Job job = new Job(conf,"SampleProgram");
job.setJarByClass(SampleMR.class); // class that contains mapper and reducer
job.setMapperClass(MyMapper.class);
job.setReducerClass(MyReducer.class); // reducer class

job.setInputFormatClass(TextInputFormat.class);
job.setOutputFormatClass(TextOutputFormat.class);
job.setNumReduceTasks(0);
FileInputFormat.setInputPaths(job, new Path("/tmp/"));
FileOutputFormat.setOutputPath(job, new Path("/tmp/out")); // adjust directories as required

job.submit();

boolean b = job.waitForCompletion(true);
if (!b) {
    throw new IOException("error with job!");
}

Mapper

public static class MyMapper extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    public void map(LongWritable idx, Text value, Context context) throws IOException, InterruptedException {
        String[] tokens = value.toString().split("|");
        String keyPrefix = tokens[0] + tokens[1];
        context.write(new Text(keyPrefix), value);
    }
}

There is also a reducer stage, but I have set the number of reduce tasks to 0 to debug the problem. The mapper is the part that misbehaves.

For the input

379782759851005|ABCDEFG|name:YOLO|top:44.7|avgtop:19.2

the expected map output is

379782759851005ABCDEFG [Blank Space] 379782759851005|ABCDEFG|name:YOLO|top:44.7|avgtop:19.2

The output of my mapper is

3 [Blank Space] 379782759851005|ABCDEFG|name:YOLO|top:44.7|avgtop:19.2

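It looks like the key contains only the first character of the expected output. If I write tokens[4] to the context as the value, the same thing happens to the value. Something seems to go wrong when the string is split. Any insight into what might be wrong?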

Best answer

You need to escape the pipe character. See the following link:

Splitting string with pipe character ("|")
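
To illustrate the answer: String.split treats its argument as a regular expression, and a bare | is the alternation operator with two empty alternatives. It therefore matches the empty string at every position, and the line is split between every character. The sketch below compares the unescaped call with two escaped variants, "\\|" and Pattern.quote("|"). The class and method names are made up for illustration and are not part of the original program.

```java
import java.util.regex.Pattern;

public class PipeSplitDemo {

    // Unescaped: "|" is a regex alternation of two empty patterns,
    // so the string is split at every character boundary.
    static String[] splitUnescaped(String line) {
        return line.split("|");
    }

    // Escaped with a backslash: matches a literal pipe character.
    static String[] splitEscaped(String line) {
        return line.split("\\|");
    }

    // Pattern.quote builds a regex that matches the given text literally.
    static String[] splitQuoted(String line) {
        return line.split(Pattern.quote("|"));
    }

    // Same key construction as in the question's mapper, with the escaped split.
    static String keyPrefix(String line) {
        String[] tokens = splitEscaped(line);
        return tokens[0] + tokens[1];
    }

    public static void main(String[] args) {
        String line = "379782759851005|ABCDEFG|name:YOLO|top:44.7|avgtop:19.2";

        System.out.println("unescaped token count: " + splitUnescaped(line).length);
        System.out.println("escaped token count:   " + splitEscaped(line).length);
        System.out.println("quoted token count:    " + splitQuoted(line).length);
        System.out.println("key prefix:            " + keyPrefix(line));
    }
}
```

In the mapper from the question, the corresponding change would be value.toString().split("\\|"). On Java 8 and later the unescaped split starts with the tokens "3" and "7"; older runtimes also produce a leading empty string, which would be consistent with the key "3" seen in the question (tokens[0] empty, tokens[1] equal to "3").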

Regarding "java - Map-Reduce program: Mapper not behaving as expected", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/31870857/
