java - Why doesn't speculative execution make sense for Giraph?

Reposted. Author: 可可西里. Updated: 2023-11-01 14:53:43

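Recently I have been running some benchmarks to understand the failover mechanism in Giraph.

What I am really curious about is this: when one worker in a job slows down, the other workers wait for it. I later found the following in GiraphJob.java: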

// Speculative execution doesn't make sense for Giraph
giraphConfiguration.setBoolean("mapred.map.tasks.speculative.execution", false);

Does anyone know why speculative execution is not enabled in Giraph?

Thanks

Best answer

First, let's recap what speculative execution is. Quoting Yahoo's Hadoop tutorial:

Speculative execution: One problem with the Hadoop system is that by dividing the tasks across many nodes, it is possible for a few slow nodes to rate-limit the rest of the program. For example if one node has a slow disk controller, then it may be reading its input at only 10% the speed of all the other nodes. So when 99 map tasks are already complete, the system is still waiting for the final map task to check in, which takes much longer than all the other nodes.

By forcing tasks to run in isolation from one another, individual tasks do not know where their inputs come from. Tasks trust the Hadoop platform to just deliver the appropriate input. Therefore, the same input can be processed multiple times in parallel, to exploit differences in machine capabilities. As most of the tasks in a job are coming to a close, the Hadoop platform will schedule redundant copies of the remaining tasks across several nodes which do not have other work to perform. This process is known as speculative execution.

When tasks complete, they announce this fact to the JobTracker. Whichever copy of a task finishes first becomes the definitive copy. If other copies were executing speculatively, Hadoop tells the TaskTrackers to abandon the tasks and discard their outputs. The Reducers then receive their inputs from whichever Mapper completed successfully, first.

Speculative execution is enabled by default. You can disable speculative execution for the mappers and reducers by setting the mapred.map.tasks.speculative.execution and mapred.reduce.tasks.speculative.execution JobConf options to false, respectively.
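
To make the quoted idea concrete, here is a minimal sketch in plain Java that needs no Hadoop dependency. It is not Hadoop code: the class SpeculativeExecutionSketch and its methods are hypothetical names invented for illustration, and thread sleeps stand in for slow and fast nodes. Two attempts process the same input, and the first one to finish supplies the result while the other is cancelled.

```java
import java.util.List;
import java.util.Locale;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical illustration only; none of these names come from Hadoop or Giraph.
class SpeculativeExecutionSketch {

    // A stateless task attempt: the same input always produces the same output,
    // which is what makes running redundant copies safe.
    static Callable<String> attempt(String input, long delayMillis) {
        return () -> {
            Thread.sleep(delayMillis); // simulates a slow or a fast node
            return input.toUpperCase(Locale.ROOT);
        };
    }

    // Runs two attempts on the same input and keeps whichever succeeds first.
    static String runSpeculatively(String input, long slowMillis, long fastMillis)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // invokeAny returns the result of one attempt that completed
            // successfully and cancels the attempts that are still running.
            return pool.invokeAny(
                    List.of(attempt(input, slowMillis), attempt(input, fastMillis)));
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runSpeculatively("split-7", 2000, 50));
    }
}
```

The sketch only works because each attempt depends on nothing but its input. That assumption is the one to keep in mind when reading the rest of the answer.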

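If my understanding of Giraph is correct, it does not use speculative execution because it uses its own iterative computation paradigm, and speculative execution does not fit that paradigm. The paradigm is inspired by Google's Pregel, which offers a more vertex-centric view of the graph data. In addition, fault tolerance is provided by checkpointing. In every iteration (also called a superstep), each graph vertex processes all of its incoming messages, and the resulting messages are then distributed between the vertices.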
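
The superstep model can be sketched in a few lines of plain Java. This is a toy model under stated assumptions, not Giraph's implementation: the class SuperstepSketch, the ring topology, and the "add what arrives" update rule are all hypothetical. It shows two things: state carries over from one superstep to the next, and the duration of a superstep is set by the slowest worker.

```java
import java.util.Arrays;

// Hypothetical toy model of superstep-style computation; not Giraph's API.
class SuperstepSketch {

    // One superstep on a ring of workers: each worker sends its current value
    // to its right-hand neighbour, then adds whatever arrived in its inbox.
    static long[] superstep(long[] state) {
        long[] inbox = new long[state.length];
        for (int w = 0; w < state.length; w++) {
            inbox[(w + 1) % state.length] += state[w];
        }
        // Implicit barrier: no worker updates until every message is delivered.
        long[] next = new long[state.length];
        for (int w = 0; w < state.length; w++) {
            next[w] = state[w] + inbox[w];
        }
        return next;
    }

    // Runs the given number of supersteps, carrying state from one to the next.
    static long[] run(long[] initial, int supersteps) {
        long[] state = initial.clone();
        for (int s = 0; s < supersteps; s++) {
            state = superstep(state);
        }
        return state;
    }

    // With a barrier, a superstep lasts as long as its slowest worker.
    static long superstepMillis(long[] workerMillis) {
        return Arrays.stream(workerMillis).max().orElse(0L);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run(new long[] {1, 2, 3}, 2)));
        System.out.println(superstepMillis(new long[] {100, 100, 900}));
    }
}
```

In this toy model, a worker's value after two supersteps depends on messages from both earlier supersteps. A fresh copy of that worker, started from the initial input alone, would have to replay those supersteps and receive the same messages again before it could stand in for the original, which is the kind of recovery a checkpoint is meant to provide.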

Put simply, Giraph does not use MapReduce in its original way, so speculative execution makes no sense for it.

Source: the original question for "java - Why doesn't speculative execution make sense for Giraph?" is on Stack Overflow: https://stackoverflow.com/questions/26583340/
