
java - Second-highest salary using MapReduce - output not as expected

Reposted. Author: 可可西里. Updated: 2023-11-01 15:22:08

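I wrote a small MapReduce job to find the second-highest salary in a dataset. I believe the second-highest-salary logic is correct, but the job produces multiple output records, and their values are wrong. There should be a single output record with the name, for example John,9000. The dataset and code are below.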

hh,0,Jeet,3000
hk,1,Mayukh,4000
nn,2,Antara,3500
mm,3,Shubu,6000
ii,4,Parsi,8000

The output should be Shubu,6000, but I get the following output instead:

Antara -2147483648
Mayukh -2147483648
Parsi 3500
Shubu 4000

The code I am using is:

public class SecondHigestMapper extends Mapper<LongWritable,Text,Text,Text>{

    private Text salary = new Text();
    private Text name = new Text();

    public void map(LongWritable key,Text value,Context context) throws IOException, InterruptedException{
        if(key.get()!=0){
            String split[]= value.toString().split(",");
            salary.set(split[2]+";"+split[3]);
            name.set("ignore");
            context.write(name,salary);
        }
    }
}


public class SecondHigestReducer extends Reducer<Text,Text,Text,IntWritable>{

    public void reduce(Text key,Iterable<Text> values,Context context) throws IOException, InterruptedException{
        int highest = 0;
        int second_highest = 0;
        int salary;

        for(Text val:values){
            String[] fn = val.toString().split("\\;");
            salary = Integer.parseInt(fn[3]);

            if(highest < salary){
                second_highest = highest;
                highest =salary;
            } else if(second_highest < salary){
                second_highest = salary;
            }
        }
        String seconHigest = String.valueOf(second_highest);
        context.write(new Text(key),new Text(seconHigest));
    }
}

public class SecondHigestDriver {

    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
        Configuration conf = new Configuration();
        Job job = new Job(conf,"Second Higest Sal");
        job.setJarByClass(SecondHigestDriver.class);
        job.setMapperClass(SecondHigestMapper.class);
        job.setCombinerClass(SecondHigestReducer.class);
        job.setReducerClass(SecondHigestReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

I am getting the exception below:

  Error: java.io.IOException: Type mismatch in value from map: expected org.apache.hadoop.io.IntWritable, received org.apache.hadoop.io.Text
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1074)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:712)
at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
at com.jeet.secondhigest.SecondHigestMapper.map(SecondHigestMapper.java:20)
at com.jeet.secondhigest.SecondHigestMapper.map(SecondHigestMapper.java:1)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)

Please help me solve this problem.

Best answer

Use a single key to force all the salaries into one reducer:

name.set("ignore");  // Could use a NullWritable
salary.set(split[2] + ";" + split[3]); // emit "name;salary" as a Text value
context.write(name, salary); // need to change the signature of the mapper class

Then, in the reducer, change the method to accept Text values, split each one, parse the salary, and compare the salaries.
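The reducer-side steps described above can be sketched in plain Java, without the Hadoop classes, to check the comparison logic on its own. This is only an illustrative sketch: the class name `SecondHighestSketch` and the method `secondHighest` are hypothetical, the values are assumed to arrive as "name;salary" strings (what the mapper emits under the single key), and a salary equal to the current highest is assumed not to count as second-highest.

```java
import java.util.Arrays;
import java.util.List;

public class SecondHighestSketch {

    /**
     * Hypothetical helper: takes "name;salary" values (all grouped under one key)
     * and returns "name,salary" for the second-highest salary,
     * or null when fewer than two distinct salaries are present.
     */
    public static String secondHighest(Iterable<String> values) {
        String highestName = null;
        String secondName = null;
        int highest = Integer.MIN_VALUE;
        int second = Integer.MIN_VALUE;

        for (String val : values) {
            String[] parts = val.split(";");          // parts[0] = name, parts[1] = salary
            String name = parts[0];
            int salary = Integer.parseInt(parts[1].trim());

            if (salary > highest) {
                // The previous highest becomes the second highest.
                second = highest;
                secondName = highestName;
                highest = salary;
                highestName = name;
            } else if (salary > second && salary < highest) {
                // Assumption: ties with the highest salary are ignored.
                second = salary;
                secondName = name;
            }
        }
        return secondName == null ? null : secondName + "," + second;
    }

    public static void main(String[] args) {
        // The five records from the question, in "name;salary" form.
        List<String> values = Arrays.asList(
                "Jeet;3000", "Mayukh;4000", "Antara;3500", "Shubu;6000", "Parsi;8000");
        System.out.println(secondHighest(values)); // prints Shubu,6000
    }
}
```

In an actual job the same loop would sit inside `reduce`, iterating over the `Iterable<Text>` values and calling `toString()` on each before splitting. Tracking the name alongside each salary is what allows a single Shubu,6000 record to be written.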

Regarding "java - Second-highest salary using MapReduce - output not as expected", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/51006960/
