
hadoop - When exactly does MapReduce call the Combiner?

Reposted. Author: 可可西里. Updated: 2023-11-01 16:31:09

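Combiners are written using the same class as reducers, and mostly the same code. But when exactly is a combiner called: before the sort and shuffle, or just before the reduce? If it runs before the sort and shuffle, i.e. right after the mapper, how does it get its input as [key, list<values>]? That grouping is produced by the sort and shuffle. And if it runs after the sort and shuffle, i.e. just before the reducer, then the combiner's output is [key, value], like a reducer's. How does the reducer then get its input as [key, list<values>]?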

Best answer

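The combiner's output types must match the mapper's output types. Hadoop makes no guarantee about how many times the combiner is applied, or even that it is applied at all.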

If your mapper extends Mapper<K1, V1, K2, V2> and your reducer extends Reducer<K2, V2, K3, V3>, then the combiner must extend Reducer<K2, V2, K2, V2>.
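The type constraint above is what makes it safe for the framework to run the combiner zero, one, or several times. The following is a minimal plain-Java sketch of that idea, not Hadoop code: the class CombinerTypeSketch and its methods are hypothetical names, K2 is assumed to be String and V2 Integer, and summation stands in for whatever associative, commutative operation a real job would use.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for a combiner shaped like Reducer<K2, V2, K2, V2>. */
final class CombinerTypeSketch {

    /**
     * Combiner: takes (K2, V2) pairs and returns (K2, V2) pairs.
     * Because input and output types are identical, its output can be
     * fed back into itself or straight into the reducer.
     */
    static List<Map.Entry<String, Integer>> combine(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> sums = new TreeMap<>();   // sorted keys, deterministic order
        for (Map.Entry<String, Integer> p : pairs) {
            sums.merge(p.getKey(), p.getValue(), Integer::sum);
        }
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (Map.Entry<String, Integer> e : sums.entrySet()) {
            out.add(Map.entry(e.getKey(), e.getValue()));
        }
        return out;
    }

    /**
     * Reducer shaped like Reducer<K2, V2, K3, V3>: it may emit a different
     * type (here a tab-separated String). In a real job the shuffle does
     * the grouping; this sketch reuses combine() for that step.
     */
    static List<String> reduce(List<Map.Entry<String, Integer>> pairs) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Integer> e : combine(pairs)) {
            lines.add(e.getKey() + "\t" + e.getValue());
        }
        return lines;
    }
}
```

Calling reduce(in), reduce(combine(in)), and reduce(combine(combine(in))) on the same input gives the same lines, which is the property a job has to preserve since the number of combiner runs is not guaranteed.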

The combiner is applied on the same machine as the map operation. It definitely runs before the shuffle.

As the Hadoop documentation states:

When the map operation outputs its pairs they are already available in memory. For efficiency reasons, sometimes it makes sense to take advantage of this fact by supplying a combiner class to perform a reduce-type function. If a combiner is used then the map key-value pairs are not immediately written to the output. Instead they will be collected in lists, one list per each key value. When a certain number of key-value pairs have been written, this buffer is flushed by passing all the values of each key to the combiner's reduce method and outputting the key-value pairs of the combine operation as if they were created by the original map operation.

http://wiki.apache.org/hadoop/HadoopMapReduce
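The buffering described in the quoted passage can be simulated in a few lines of plain Java. This is only a sketch under stated assumptions, not Hadoop's implementation: MapSideBufferSketch, collect, flush, and flushThreshold are hypothetical names, and the flush trigger is a simple pair count rather than any real Hadoop setting. It shows how the combiner can receive [key, list<values>] before the shuffle, and how the reducer still receives [key, list<values>] afterwards.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical simulation of a map-side buffer that runs a combiner on flush. */
final class MapSideBufferSketch {
    private final int flushThreshold;   // assumed knob for this sketch only
    private final Map<String, List<Integer>> buffer = new TreeMap<>();
    private final List<Map.Entry<String, Integer>> mapOutput = new ArrayList<>();
    private int buffered = 0;

    MapSideBufferSketch(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    /** Called for every (key, value) the mapper emits: one list is kept per key. */
    void collect(String key, int value) {
        buffer.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        buffered++;
        if (buffered >= flushThreshold) {
            flush();
        }
    }

    /** Hand each key's list to the combiner and emit the result as ordinary map output. */
    void flush() {
        for (Map.Entry<String, List<Integer>> e : buffer.entrySet()) {
            mapOutput.add(Map.entry(e.getKey(), combine(e.getValue())));
        }
        buffer.clear();
        buffered = 0;
    }

    /** Combiner logic for one key: list of values in, single value out (a sum here). */
    static int combine(List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    List<Map.Entry<String, Integer>> output() {
        return mapOutput;
    }

    /** Shuffle stand-in: regroup combined pairs so the reducer sees (key, list of values). */
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            grouped.computeIfAbsent(p.getKey(), k -> new ArrayList<>()).add(p.getValue());
        }
        return grouped;
    }
}
```

With a threshold of 3, collecting ("a", 1), ("b", 1), ("a", 1), ("a", 1) and then calling flush() once more at the end of the map yields the map output (a, 2), (b, 1), (a, 1). The shuffle stand-in regroups that into a: [2, 1] and b: [1], so the reducer's input shape is unchanged by the combiner having run.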

Regarding "hadoop - When exactly does MapReduce call the Combiner?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/31260648/
