
hadoop - Stopping the Reduce function in Hadoop based on a condition


I have a reduce function, and I want it to stop after processing some 'n' keys. I have set up a counter that increments on each key, and I return from the reduce function when the condition is met.

Here is the code:

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;

    public class wordcount {

        public static class Map extends Mapper<LongWritable, Text, IntWritable, IntWritable> {
            private final static IntWritable one = new IntWritable(1);
            private IntWritable leng = new IntWritable();

            public void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                String line = value.toString();
                StringTokenizer tokenizer = new StringTokenizer(line);
                while (tokenizer.hasMoreTokens()) {
                    String lword = tokenizer.nextToken();
                    // emit (word length, 1) for every token
                    leng.set(lword.length());
                    context.write(leng, one);
                }
            }
        }

        public static class Reduce extends Reducer<IntWritable, IntWritable, IntWritable, IntWritable> {

            int count = 0;

            public void reduce(IntWritable key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable val : values) {
                    sum += val.get();
                    count++;
                }
                context.write(key, new IntWritable(sum));
                // this return only ends the call for the current key; the framework
                // still invokes reduce() for every remaining key
                if (count > 19) return;
            }
        }
    }

Is there any other way to achieve this?

Best Answer

Returning from reduce() only ends the call for the current key; the framework will still invoke reduce() for every remaining key. You can achieve this instead by overriding run() of the Reducer class (new API):

    public static class Reduce extends Reducer<IntWritable, IntWritable, IntWritable, IntWritable> {

        // reduce() method here, unchanged

        // Override run(), which drives the calls to reduce()
        @Override
        public void run(Context context) throws IOException, InterruptedException {
            setup(context);
            int count = 0;
            while (context.nextKey()) {
                if (count++ < n) { // n = the number of keys to process
                    reduce(context.getCurrentKey(), context.getValues(), context);
                } else {
                    // exit or do whatever you want, e.g. stop consuming keys
                    break;
                }
            }
            cleanup(context);
        }
    }
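
The snippet above assumes the limit n is already available inside the reducer. As a minimal sketch of one way to supply it, the driver could pass the limit through the job Configuration; the property name reduce.key.limit below is a hypothetical choice, not part of the original answer, and the class is meant to sit inside the job class like Reduce above:

    public static class Reduce extends Reducer<IntWritable, IntWritable, IntWritable, IntWritable> {

        // maximum number of keys this reduce task will process,
        // read from the job configuration in setup()
        private int n;

        @Override
        protected void setup(Context context) {
            // "reduce.key.limit" is a hypothetical property name; default to
            // "no limit" when the driver does not set it
            n = context.getConfiguration().getInt("reduce.key.limit", Integer.MAX_VALUE);
        }

        public void reduce(IntWritable key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            context.write(key, new IntWritable(sum));
        }

        @Override
        public void run(Context context) throws IOException, InterruptedException {
            setup(context);
            int count = 0;
            // stop pulling keys from the shuffle once the limit is reached
            while (context.nextKey() && count++ < n) {
                reduce(context.getCurrentKey(), context.getValues(), context);
            }
            cleanup(context);
        }
    }

In the driver this would be paired with something like job.getConfiguration().setInt("reduce.key.limit", 20). Keep in mind that run() executes once per reduce task, so with several reducers each task processes up to n keys; calling job.setNumReduceTasks(1) makes the limit apply to the job as a whole.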

Regarding "hadoop - Stopping the Reduce function in Hadoop based on a condition", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/15544205/
