
hadoop - Changing dfs.block.size while an application is running

Reposted. Author: 行者123. Updated: 2023-12-02 21:47:25

Since dfs.block.size is an HDFS setting, changing it during application execution should make no difference, right? For example, if the job's file has a block size of 128 and I call

hadoop jar /path/to/.jar xxx -D dfs.block.size=256

will it make any difference? Or do I need to change the block size before saving the file to HDFS? Are dfs.block.size and a task's split size directly related? If I'm correct and they are not, is there a way to specify the split size?
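(For context: an HDFS block size is fixed when a file is written, so it is typically set at write time. A hedged sketch, assuming the shell client; the paths are placeholders, and dfs.blocksize is the current property name, with dfs.block.size being the older, deprecated spelling:)

```shell
# Write a file with 256 MB blocks; files already in HDFS keep their
# original block size regardless of this setting.
hadoop fs -D dfs.blocksize=268435456 -put localfile /user/me/input/
```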

Best Answer

The parameters that decide the split size for each map task are

mapred.max.split.size & mapred.min.split.size

mapred.max.split.size can be set per job through your Configuration object. Do not change dfs.block.size: it is an HDFS setting, so it also changes the block size of the files your job writes as output.
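So the per-job knob is the split-size property, not dfs.block.size. A hedged sketch of the command line, matching the question's invocation style (jar name, driver class, and paths are placeholders; on Hadoop 2+ the property is named mapreduce.input.fileinputformat.split.maxsize):

```shell
# Ask for splits of at most ~64 MB for this job only;
# the HDFS block size of the input files is untouched.
hadoop jar myjob.jar MyDriver \
  -D mapred.max.split.size=67108864 \
  /input /output
```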

If mapred.min.split.size is less than the block size and mapred.max.split.size is greater than the block size, then one block is sent to each map task. The block's data is split into key/value pairs according to the InputFormat you use.
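The rule above comes from the formula FileInputFormat uses to pick a split size: max(minSize, min(maxSize, blockSize)). A minimal sketch of that formula in plain Java (this mirrors the computation, it is not the actual Hadoop class):

```java
public class SplitSize {
    // Mirrors FileInputFormat's split-size rule:
    // max(minSplitSize, min(maxSplitSize, blockSize))
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    public static void main(String[] args) {
        long block = 128L << 20; // 128 MB HDFS block

        // min < block < max: the split equals one block per map task
        System.out.println(computeSplitSize(block, 1L, Long.MAX_VALUE));

        // raising min above the block size gives larger splits (fewer mappers)
        System.out.println(computeSplitSize(block, 256L << 20, Long.MAX_VALUE));

        // lowering max below the block size gives smaller splits (more mappers)
        System.out.println(computeSplitSize(block, 1L, 64L << 20));
    }
}
```

This also shows why the answer says one block goes to each map task in the default case: with a tiny min and a huge max, the middle term (the block size) always wins.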

Regarding hadoop - changing dfs.block.size while an application is running, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/23982422/
