
Hadoop ClassCastException for the default value of InputFormat


I'm running into a problem getting my first map-reduce code going on Hadoop. I copied the following code from "Hadoop: The Definitive Guide", but I can't get it to run on my single-node Hadoop installation.

My code snippets:

Main:

Job job = new Job(); 
job.setJarByClass(MaxTemperature.class);
job.setJobName("Max temperature");

FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));

job.setMapperClass(MaxTemperatureMapper.class);
job.setReducerClass(MaxTemperatureReducer.class);

job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);

System.exit(job.waitForCompletion(true) ? 0 : 1);

Mapper:

public void map(LongWritable key, Text value, Context context)

Reducer:

public void reduce(Text key, Iterable<IntWritable> values,
Context context)

The implementations of the map and reduce functions are also taken straight from the book. But when I try to execute this code, this is the error I get:

INFO mapred.JobClient: Task Id : attempt_201304021022_0016_m_000000_0, Status : FAILED
java.lang.ClassCastException: interface javax.xml.soap.Text
at java.lang.Class.asSubclass(Class.java:3027)
at org.apache.hadoop.mapred.JobConf.getOutputKeyComparator(JobConf.java:774)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:959)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:674)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:756)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
at org.apache.hadoop.mapred.Child.main(Child.java:249)

An answer to a similar question in the past ( Hadoop type mismatch in key from map expected value Text received value LongWritable ) helped me figure out that the InputFormat class should match the input of the map function. So I also tried adding job.setInputFormatClass(TextInputFormat.class); to my main method, but that didn't solve the problem either. What could be wrong here?
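
For completeness, here is a minimal sketch of what my full driver looks like with that attempt applied (class names as above; as far as I understand, TextInputFormat is already the default, so setting it explicitly should not change behavior):

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MaxTemperature {

    public static void main(String[] args) throws Exception {
        Job job = new Job();
        job.setJarByClass(MaxTemperature.class);
        job.setJobName("Max temperature");

        // TextInputFormat is the default; this call only makes the
        // expectation explicit.
        job.setInputFormatClass(TextInputFormat.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        job.setMapperClass(MaxTemperatureMapper.class);
        job.setReducerClass(MaxTemperatureReducer.class);

        // With no separate setMapOutputKeyClass/setMapOutputValueClass
        // calls, these also declare the map output types.
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}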

Here is the implementation of the Mapper class:

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MaxTemperatureMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    private static final int MISSING = 9999;

    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {

        String line = value.toString();
        String year = line.substring(15, 19);

        int airTemperature;
        if (line.charAt(45) == '+') { // parseInt doesn't like leading plus signs
            airTemperature = Integer.parseInt(line.substring(46, 50));
        } else {
            airTemperature = Integer.parseInt(line.substring(45, 50));
        }
        String quality = line.substring(50, 51);
        if (airTemperature != MISSING && quality.matches("[01459]")) {
            context.write(new Text(year), new IntWritable(airTemperature));
        }
    }

}

Best Answer

You auto-imported the wrong class. Instead of importing org.apache.hadoop.io.Text, you imported javax.xml.soap.Text.
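
That also matches the stack trace: when setting up the map output sort, JobConf.getOutputKeyComparator casts the configured key class to WritableComparable via Class.asSubclass, and javax.xml.soap.Text is an unrelated interface, so the cast fails before your map function ever runs. The fix is a one-line change at the top of each class that uses Text:

// Wrong: an IDE auto-import can silently pick this interface
// import javax.xml.soap.Text;

// Right: Hadoop's writable text type
import org.apache.hadoop.io.Text;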

You can find a sample of this wrong import in this blog.

On the topic of this Hadoop ClassCastException for the default value of InputFormat, there is a similar question on Stack Overflow: https://stackoverflow.com/questions/15814266/
