
hadoop - Getting the top N items from mapper output - MapReduce

Reposted. Author: 可可西里. Updated: 2023-11-01 16:30:54

My Mapper task returns the following output:

2 c
2 g
3 a
3 b
6 r

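I have already written the reducer code and a key comparator, and together they produce correctly sorted output. How do I get only the top 3 records of the Mapper output (the top N by count)?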

public static class WLReducer2 extends
        Reducer<IntWritable, Text, Text, IntWritable> {

    @Override
    protected void reduce(IntWritable key, Iterable<Text> values,
            Context context) throws IOException, InterruptedException {

        for (Text x : values) {
            context.write(new Text(x), key);
        }
    }
}

public static class KeyComparator extends WritableComparator {
    protected KeyComparator() {
        super(IntWritable.class, true);
    }

    @Override
    public int compare(WritableComparable w1, WritableComparable w2) {
        IntWritable ip1 = (IntWritable) w1;
        IntWritable ip2 = (IntWritable) w2;
        // Negate the natural order so that larger counts sort first.
        int cmp = -1 * ip1.compareTo(ip2);

        return cmp;
    }
}

This is the reducer's output:

r 6
b 3
a 3
g 2
c 2

The expected reducer output is the top 3 by count:

r 6
b 3
a 3

Best answer

Limit the reducer's output, like this:

public static class WLReducer2 extends
        Reducer<IntWritable, Text, Text, IntWritable> {
    // Number of records this reducer task has seen so far.
    int count = 0;

    @Override
    protected void reduce(IntWritable key, Iterable<Text> values,
            Context context) throws IOException, InterruptedException {

        for (Text x : values) {
            // Keys arrive in descending order, so the first 3 records are the top 3.
            if (count < 3) {
                context.write(new Text(x), key);
            }
            count++;
        }
    }
}

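Set the number of reducers to 1 with job.setNumReduceTasks(1). With a single reducer, every key passes through the same task in sorted order, so the counter limits the output globally rather than per reducer.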
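
To see why a counter plus a descending sort is enough, here is a minimal plain-Java sketch that imitates the single-reducer flow without any Hadoop dependency. The names TopNSketch, Pair, and topN are hypothetical and exist only for this illustration. The sort stands in for KeyComparator and the loop stands in for the reducer's counter. This is a simulation of the idea, not the job itself.

```java
import java.util.ArrayList;
import java.util.List;

public class TopNSketch {

    /** One (count, word) record, standing in for a mapper output pair. */
    static final class Pair {
        final int count;
        final String word;

        Pair(int count, String word) {
            this.count = count;
            this.word = word;
        }
    }

    /**
     * Sorts by descending count (the role KeyComparator plays in the job),
     * then keeps only the first n records (the role of the reducer's counter).
     */
    static List<Pair> topN(List<Pair> mapperOutput, int n) {
        List<Pair> sorted = new ArrayList<>(mapperOutput);
        sorted.sort((a, b) -> Integer.compare(b.count, a.count));

        List<Pair> emitted = new ArrayList<>();
        int count = 0;
        for (Pair p : sorted) {
            if (count < n) {
                emitted.add(p);
            }
            count++;
        }
        return emitted;
    }

    public static void main(String[] args) {
        // Sample data copied from the mapper output in the question.
        List<Pair> mapperOutput = new ArrayList<>();
        mapperOutput.add(new Pair(2, "c"));
        mapperOutput.add(new Pair(2, "g"));
        mapperOutput.add(new Pair(3, "a"));
        mapperOutput.add(new Pair(3, "b"));
        mapperOutput.add(new Pair(6, "r"));

        for (Pair p : topN(mapperOutput, 3)) {
            System.out.println(p.word + " " + p.count);
        }
    }
}
```

Records with equal counts (here a and b) may come out in either order in a real job, because Hadoop does not order the values within a key unless a secondary sort is configured. If ties at the cutoff matter, the limit has to be applied per distinct key rather than per record.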

Regarding "hadoop - Getting the top N items from mapper output - MapReduce", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/32791430/
