
java - Hadoop: Error: java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.Text, received org.apache.hadoop.io.LongWritable

Reposted — Author: 可可西里 · Updated: 2023-11-01 16:10:44

I am trying to write a MapReduce program that computes an inverted index.

My mapper code is:

public class InvertdIdxMapper extends Mapper<LongWritable, Text, Text, Text> {

    public void map(LongWritable ikey, Text ivalue, Context context, Reporter reporter)
            throws IOException, InterruptedException {

        Text word = new Text();
        Text location = new Text();

        FileSplit filespilt = (FileSplit) reporter.getInputSplit();
        String fileName = filespilt.getPath().getName();
        location.set(fileName);

        String line = ivalue.toString();
        StringTokenizer itr = new StringTokenizer(line.toLowerCase());
        while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            //System.out.println("Key is " + word + " value is " + location);
            context.write(word, location);
        }
    }
}

My reducer code is:

public class InvertedIdxReducer extends Reducer<Text, Text, Text, Text> {

    public void reduce(Text _key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {

        boolean first = true;
        StringBuilder toReturn = new StringBuilder();
        // process values
        Iterator<Text> itr = values.iterator();
        while (itr.hasNext()) {
            if (!first)
                toReturn.append(", ");
            first = false;
            toReturn.append(itr.next().toString());
        }
        context.write(_key, new Text(toReturn.toString()));
    }
}
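(Side note on the reducer's join loop: `String.join` cannot be used directly here because it accepts only `CharSequence` elements, which Hadoop's `Text` does not implement, so the first/append pattern is the usual approach. A minimal plain-Java sketch of the same join logic, outside Hadoop — class and method names are illustrative only:)

```java
import java.util.Arrays;

public class JoinDemo {
    // Equivalent of the reducer's first/append loop: join values with ", ".
    static String joinValues(Iterable<String> values) {
        StringBuilder sb = new StringBuilder();
        boolean first = true;
        for (String v : values) {
            if (!first)
                sb.append(", ");
            first = false;
            sb.append(v);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinValues(Arrays.asList("doc1.txt", "doc2.txt")));
    }
}
```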

The driver code is:

public class InvertedIdxDriver {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "JobName");
        job.setJarByClass(InvertedIdxDriver.class);
        // TODO: specify a mapper
        job.setMapperClass(InvertdIdxMapper.class);
        // TODO: specify a reducer
        job.setReducerClass(InvertedIdxReducer.class);

        // TODO: specify output types
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);

        // TODO: specify input and output DIRECTORIES (not files)
        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        if (!job.waitForCompletion(true))
            return;
    }
}

When I run the above code, I get the following error:

 15/08/18 13:27:04 INFO mapreduce.Job: Task Id : attempt_1439870445298_0019_m_000000_2, Status : FAILED
Error: java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.Text, received org.apache.hadoop.io.LongWritable
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1069)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:712)
at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
at org.apache.hadoop.mapreduce.Mapper.map(Mapper.java:124)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

The input to this program is a simple text file with only a few lines. I followed this and this post, but my problem persists. Am I missing some important point about map-reduce programming?

Please advise.

Thanks

Best Answer

I think you are not overriding the map method correctly, so the default map method is invoked instead — and the default implementation is an identity function that writes the input (LongWritable offset, Text line) pair straight through, which is exactly why the framework reports a LongWritable key where your job declared Text. Your map method takes an extra Reporter parameter, which belongs to the old mapred API; with that fourth parameter the signature no longer matches Mapper.map, so it never overrides it (adding an @Override annotation would have turned this mistake into a compile-time error). The correct signature is:

protected void map(LongWritable iKey, Text iValue, Context context) throws IOException, InterruptedException

You also need to replace this line:

FileSplit filespilt=(FileSplit)reporter.getInputSplit();

with:

FileSplit filespilt=(FileSplit)context.getInputSplit();
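With those two fixes applied the mapper should run. As a sanity check on the inverted-index logic itself, the map and reduce steps can be simulated without Hadoop (a sketch using plain Java collections; the class and method names here are illustrative, not part of the original code):

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

public class InvertedIndexSim {
    // Simulate the mapper: emit a (word, fileName) pair for each token in a line.
    static List<Map.Entry<String, String>> map(String fileName, String line) {
        List<Map.Entry<String, String>> out = new ArrayList<>();
        StringTokenizer itr = new StringTokenizer(line.toLowerCase());
        while (itr.hasMoreTokens()) {
            out.add(new AbstractMap.SimpleEntry<>(itr.nextToken(), fileName));
        }
        return out;
    }

    // Simulate the shuffle + reducer: group pairs by word, join locations with ", ".
    static Map<String, String> reduce(List<Map.Entry<String, String>> pairs) {
        Map<String, List<String>> grouped = new TreeMap<>();
        for (Map.Entry<String, String> p : pairs) {
            grouped.computeIfAbsent(p.getKey(), k -> new ArrayList<>()).add(p.getValue());
        }
        Map<String, String> result = new TreeMap<>();
        grouped.forEach((word, locs) -> result.put(word, String.join(", ", locs)));
        return result;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> pairs = new ArrayList<>();
        pairs.addAll(map("a.txt", "Hello World"));
        pairs.addAll(map("b.txt", "hello Hadoop"));
        System.out.println(reduce(pairs));
    }
}
```

Note that in a real Hadoop reducer the values must still be joined manually (as in the reducer above), since the iterable yields reused `Text` objects rather than `String`s.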

For "java - Hadoop: Error: java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.Text, received org.apache.hadoop.io.LongWritable", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/32066939/
