
spring - Spring DataFlow YARN - container is running beyond physical memory limits


I am running Spring Cloud Tasks on YARN. Simple tasks work fine, but a larger task that needs more resources fails with a "container is running beyond physical memory limits" error:

onContainerCompleted:ContainerStatus: [ContainerId: 
container_1485796744143_0030_01_000002, State: COMPLETE, Diagnostics: Container [pid=27456,containerID=container_1485796744143_0030_01_000002] is running beyond physical memory limits. Current usage: 652.5 MB of 256 MB physical memory used; 5.6 GB of 1.3 GB virtual memory used. Killing container.
Dump of the process-tree for container_1485796744143_0030_01_000002 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 27461 27456 27456 27456 (java) 1215 126 5858455552 166335 /usr/lib/jvm/java-1.8.0/bin/java -Dserver.port=0 -Dspring.jmx.enabled=false -Dspring.config.location=servers.yml -jar cities-job-0.0.1.jar --spring.datasource.driverClassName=org.h2.Driver --spring.datasource.username=sa --spring.cloud.task.name=city2 --spring.datasource.url=jdbc:h2:tcp://localhost:19092/mem:dataflow
|- 27456 27454 27456 27456 (bash) 0 0 115806208 705 /bin/bash -c /usr/lib/jvm/java-1.8.0/bin/java -Dserver.port=0 -Dspring.jmx.enabled=false -Dspring.config.location=servers.yml -jar cities-job-0.0.1.jar --spring.datasource.driverClassName='org.h2.Driver' --spring.datasource.username='sa' --spring.cloud.task.name='city2' --spring.datasource.url='jdbc:h2:tcp://localhost:19092/mem:dataflow' 1>/var/log/hadoop-yarn/containers/application_1485796744143_0030/container_1485796744143_0030_01_000002/Container.stdout 2>/var/log/hadoop-yarn/containers/application_1485796744143_0030/container_1485796744143_0030_01_000002/Container.stderr

I tried adjusting the options in Data Flow's servers.yml:
spring:
  deployer:
    yarn:
      app:
        baseDir: /dataflow
        taskappmaster:
          memory: 512m
          virtualCores: 1
          javaOpts: "-Xms512m -Xmx512m"
        taskcontainer:
          priority: 1
          memory: 512m
          virtualCores: 1
          javaOpts: "-Xms256m -Xmx512m"

I can see that the taskappmaster memory change takes effect (the AM container in YARN is set to that value), but the taskcontainer memory option changes nothing: every container created for the Cloud Task still gets only 256 MB, which is the YARN deployer's default.

With this servers.yml the expected result is two containers of 512 MB each, one for the Application Master and one for the application container. Instead, YARN allocates two containers: 512 MB for the application master and 256 MB for the application.

I don't think the issue is caused by wrong YARN options, since Spark applications on the same cluster correctly take gigabytes of memory.

Some of my YARN settings:
mapreduce.reduce.java.opts -Xmx2304m
mapreduce.reduce.memory.mb 2880
mapreduce.map.java.opts -Xmx3277m
mapreduce.map.memory.mb 4096
yarn.nodemanager.vmem-pmem-ratio 5
yarn.nodemanager.vmem-check-enabled false
yarn.scheduler.minimum-allocation-mb 32
yarn.nodemanager.resource.memory-mb 11520
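
One way to double-check which of these values the NodeManagers actually see is to grep the config files on a node; a rough sketch, assuming EMR's default configuration directory /etc/hadoop/conf (the paths below are an assumption, not taken from the question):

# pmem/vmem enforcement used by the NodeManagers
grep -A1 'yarn.nodemanager.vmem-check-enabled' /etc/hadoop/conf/yarn-site.xml
grep -A1 'yarn.nodemanager.vmem-pmem-ratio' /etc/hadoop/conf/yarn-site.xml
# container requests are rounded up to a multiple of this value
grep -A1 'yarn.scheduler.minimum-allocation-mb' /etc/hadoop/conf/yarn-site.xml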

My Hadoop runtime is EMR 4.4.0, and I also had to change the default Java to 1.8.

Best Answer

Deleting the /dataflow directory in HDFS solved the problem; once it is removed, Spring Data Flow uploads all the required files again. An alternative is to delete the files yourself and upload fresh ones.
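
As a rough sketch of that fix, assuming the baseDir configured in servers.yml above (/dataflow) and a shell on a node with the HDFS client configured for the cluster:

# see which artifacts the deployer uploaded on earlier runs
hdfs dfs -ls -R /dataflow
# remove the stale copies; on the next task launch Spring Cloud Data Flow re-uploads everything it needs
hdfs dfs -rm -r /dataflow

Removing the directory forces a fresh upload, which would explain why changes to the taskcontainer settings in servers.yml only take effect afterwards.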

Regarding spring - Spring DataFlow YARN - container is running beyond physical memory limits, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/41959711/
