
hadoop - EMR Hadoop (MRv2) cluster maxes out at 80% capacity. How do I get the remaining 20%?


I am using Elastic MapReduce (Hadoop 2.0 and YARN) on AWS.

The configuration is as follows:

10 x g2.2xlarge core instances with 15GB of RAM and 8 CPU cores
yarn.nodemanager.vmem-check-enabled=false
yarn.scheduler.minimum-allocation-mb=2048
yarn.nodemanager.resource.memory-mb=12288
mapreduce.map.memory.mb=3072

When running a job, the scheduler shows that only 81.7% of the cluster is allocated:

Used Capacity: 81.7%
Absolute Used Capacity: 81.7%
Absolute Capacity: 100.0%
Absolute Max Capacity: 100.0%
Used Resources:
Num Schedulable Applications: 1
Num Non-Schedulable Applications: 0
Num Containers: 25
Max Applications: 10000
Max Applications Per User: 10000
Max Schedulable Applications: 6
Max Schedulable Applications Per User: 6
Configured Capacity: 100.0%
Configured Max Capacity: 100.0%
Configured Minimum User Limit Percent: 100%
Configured User Limit Factor: 1.0
Active users: hadoop

The scheduler assigns at most 3 containers per node, and the total number of containers is capped at 25.

Why does it only allocate 25 containers?

From the memory settings I would expect to see

yarn.nodemanager.resource.memory-mb(12288) / mapreduce.map.memory.mb(3072) = 4 containers per node
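
The likely culprit is request normalization: assuming YARN rounds each container request up to a multiple of yarn.scheduler.minimum-allocation-mb, a 3072 MB request is actually granted 4096 MB. A minimal Python sketch of that arithmetic, using only the values above:

# Sketch of the assumed normalization: round each container request up
# to the nearest multiple of the scheduler's minimum allocation.
def normalize(request_mb, min_alloc_mb):
    return -(-request_mb // min_alloc_mb) * min_alloc_mb  # ceiling division

node_mb = 12288      # yarn.nodemanager.resource.memory-mb
min_alloc_mb = 2048  # yarn.scheduler.minimum-allocation-mb
map_mb = 3072        # mapreduce.map.memory.mb

granted = normalize(map_mb, min_alloc_mb)
print(granted)             # 4096 -- not the 3072 that was requested
print(node_mb // granted)  # 3 containers per node, matching the scheduler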

Thanks

P.S. This looks like a similar question, but it went unanswered: How concurrent # mappers and # reducers are calculated in Hadoop 2 + YARN?

Best Answer

I got it working after going through this tutorial.

I changed two things:

  1. There was a typo in mapreduce.map.memory.mb
  2. The default for mapreduce.map.java.opts was set too low

The final settings that worked for me are:

yarn.nodemanager.vmem-pmem-ratio=50
yarn.nodemanager.resource.memory-mb=12288
yarn.scheduler.minimum-allocation-mb=3057
yarn.app.mapreduce.am.resource.mb=6114
mapreduce.map.java.opts=-Xmx2751m
mapreduce.map.memory.mb=3057
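
As a quick arithmetic check (a minimal sketch using only the numbers above): mapreduce.map.memory.mb now equals the minimum allocation, so requests are not rounded up, and the -Xmx heap is roughly 90% of the container size, leaving headroom for non-heap memory so YARN does not kill the container.

node_mb = 12288  # yarn.nodemanager.resource.memory-mb
map_mb = 3057    # mapreduce.map.memory.mb, equal to the minimum allocation
heap_mb = 2751   # -Xmx from mapreduce.map.java.opts

print(node_mb // map_mb)           # 4 containers per node
print(round(heap_mb / map_mb, 2))  # 0.9 -- heap leaves ~10% headroom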

It now allocates the full 4 containers per node.

Regarding "hadoop - EMR Hadoop (MRv2) cluster maxes out at 80% capacity. How do I get the remaining 20%?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/27829829/
