
hadoop - Writing text output from the map function in Hadoop

Reposted — Author: 可可西里 · Updated: 2023-11-01 16:22:06

Input:

a,b,c,d,e

q,w,34,r,e

1,2,3,4,e

In the mapper, I take the value of the last field of each line, and I want to emit (e, (a,b,c,d)) — that is, emit (key, (the remaining fields of that line)).
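As a plain-Java illustration (outside Hadoop, with a hypothetical class name and the sample record above), splitting one line into the desired key and value might look like this:

```java
import java.util.Arrays;

public class SplitDemo {
    public static void main(String[] args) {
        String line = "a,b,c,d,e"; // one sample input record
        String[] attr = line.split(",");
        // the last field becomes the key
        String mapKey = attr[attr.length - 1];
        // the remaining fields become the value
        String rest = String.join(",", Arrays.copyOf(attr, attr.length - 1));
        System.out.println(mapKey + ", (" + rest + ")"); // prints: e, (a,b,c,d)
    }
}
```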

Thanks for any help.

Current code:

public static class Map extends Mapper<LongWritable, Text, Text, Text> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        String line = value.toString(); // reads the input line by line
        String[] attr = line.split(","); // extract each attribute value from the csv record
        context.write(attr[argno-1], line); // gives error, seems to like only integer? how to override this?
    }
}
public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        // further processing: loads the chunk into a 2D ArrayList for processing
    }
}
public static void main(String[] args) throws Exception {
    String line;
    String[] arguments;
    Configuration conf = new Configuration();

    // compute the total number of attributes in the file
    FileReader infile = new FileReader(args[0]);
    BufferedReader bufread = new BufferedReader(infile);
    line = bufread.readLine();
    arguments = line.split(","); // split the fields separated by commas
    conf.setInt("argno", arguments.length); // save the attribute count
    Job job = new Job(conf, "nb");
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);
    job.setMapperClass(Map.class); /* The method setMapperClass(Class<? extends Mapper>) in the type Job is not applicable for the arguments (Class<Map>) */
    job.setReducerClass(Reduce.class);
    job.setInputFormatClass(TextInputFormat.class);
    job.setOutputFormatClass(TextOutputFormat.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    job.waitForCompletion(true);
}

Please note the errors I'm running into (see the comments in the code).

Best Answer

This is fairly simple. First parse your string to get the key, and pass the rest of the line as the value. Then use an identity reducer, which groups all values sharing the same key into a list in the output. The output will be in the same format.

So your map function will output:

e, (a,b,c,d,e)

e, (q,w,34,r,e)

e, (1,2,3,4,e)

Then after the identity reduce it should output:

e, {a,b,c,d,e; q,w,34,r,e; 1,2,3,4,e}
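To see the grouping the identity reduce performs, here is a small Hadoop-free simulation of the map and shuffle steps (the class name and in-memory grouping are just for illustration; in a real job the framework does the grouping):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;

public class MapReduceSim {
    public static void main(String[] args) {
        // the three sample input records from the question
        String[] records = {"a,b,c,d,e", "q,w,34,r,e", "1,2,3,4,e"};

        // map phase: emit (last field, whole line) for each record;
        // the shuffle then groups values by key, simulated here with a map
        java.util.Map<String, List<String>> grouped = new LinkedHashMap<>();
        for (String line : records) {
            String[] attr = line.split(",");
            String key = attr[attr.length - 1]; // last field is the key
            grouped.computeIfAbsent(key, k -> new ArrayList<>()).add(line);
        }

        // "identity reduce": just write out each key with its grouped values
        for (java.util.Map.Entry<String, List<String>> e : grouped.entrySet()) {
            System.out.println(e.getKey() + ", " + e.getValue());
            // prints: e, [a,b,c,d,e, q,w,34,r,e, 1,2,3,4,e]
        }
    }
}
```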

public static class Map extends Mapper<LongWritable, Text, Text, Text> {
    private int argno; // number of attributes, read back from the job configuration

    @Override
    protected void setup(Context context) {
        argno = context.getConfiguration().getInt("argno", 0);
    }

    @Override
    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        String line = value.toString(); // one csv record per call
        String[] attr = line.split(","); // extract each attribute value from the record
        // context.write() requires Writable types, so wrap the Strings in Text
        context.write(new Text(attr[argno - 1]), new Text(line));
    }
}

public static void main(String[] args) throws Exception {
    String line;
    String[] arguments;
    Configuration conf = new Configuration();

    // compute the total number of attributes in the file
    FileReader infile = new FileReader(args[0]);
    BufferedReader bufread = new BufferedReader(infile);
    line = bufread.readLine();
    arguments = line.split(","); // split the fields separated by commas
    conf.setInt("argno", arguments.length); // save the attribute count for the mapper
    Job job = new Job(conf, "nb");
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);
    job.setMapperClass(Map.class);
    // no setReducerClass(): Hadoop falls back to the base Reducer class,
    // which acts as an identity reducer
    job.setInputFormatClass(TextInputFormat.class);
    job.setOutputFormatClass(TextOutputFormat.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    job.waitForCompletion(true);
}

Regarding "hadoop - Writing text output from the map function in Hadoop", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/13389946/
