
java - Spark deploy-related properties in spark-submit


When creating a Java application based on Spark, the SparkConf is created with

    sparkConf = new SparkConf().setAppName("SparkTests")
            .setMaster("local[*]")
            .set("spark.executor.memory", "2g")
            .set("spark.driver.memory", "2g")
            .set("spark.driver.maxResultSize", "2g");

But in the documentation here, it says:

Any values specified as flags or in the properties file will be passed on to the application and merged with those specified through SparkConf. Properties set directly on the SparkConf take highest precedence, then flags passed to spark-submit or spark-shell, then options in the spark-defaults.conf file. A few configuration keys have been renamed since earlier versions of Spark; in such cases, the older key names are still accepted, but take lower precedence than any instance of the newer key. Spark properties mainly can be divided into two kinds: one is related to deploy, like “spark.driver.memory”, “spark.executor.instances”, this kind of properties may not be affected when setting programmatically through SparkConf in runtime, or the behavior is depending on which cluster manager and deploy mode you choose, so it would be suggested to set through configuration file or spark-submit command line options; another is mainly related to Spark runtime control, like “spark.task.maxFailures”, this kind of properties can be set in either way.

So is there a list of these deploy-related properties that I can only provide as command-line arguments to spark-submit?

local[*] is given here, but at runtime we deploy via a YARN cluster.
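For reference, deploy-related properties like the ones above are typically passed on the spark-submit command line when deploying to YARN. A sketch of such an invocation (the main class and jar path are placeholders, not from the original question):

```shell
# Deploy-related properties given as spark-submit flags instead of SparkConf:
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --driver-memory 2g \
  --executor-memory 2g \
  --conf spark.driver.maxResultSize=2g \
  --class com.example.SparkTests \
  path/to/app.jar
```

Note that --master yarn on the command line replaces the hard-coded local[*] from the code above.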

Best Answer

I am also not sure what exactly this sentence means:

this kind of properties may not be affected when setting programmatically through SparkConf in runtime, or the behavior is depending on which cluster manager and deploy mode you choose, so it would be suggested to set through configuration file or spark-submit command line options; another is mainly related to Spark runtime control

Maybe someone can clarify it for us. What I do know is that in the case of YARN, the precedence is as follows:

  1. If you set a setting in code using

    SparkSession.builder()
    .config(sparkConf)
    .getOrCreate()

    it will override all other settings (command line, defaults.conf). The only exception here is when you modify a setting after the session has been initialized (after session.getOrCreate has been called); in that case it will be ignored, as far as I can tell.

  2. If you do not modify the setting in code, it falls back to the command-line settings (Spark takes those specified on the command line, and otherwise loads them from defaults.conf).

  3. Finally, if neither of the above is given, it loads the setting from the default configuration file.
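The precedence described above can be illustrated with a toy merge in plain Java. This is only a sketch of the documented ordering (values set on SparkConf beat spark-submit flags, which beat spark-defaults.conf), not Spark's actual merge logic; the class and method names are made up for illustration:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ConfPrecedenceSketch {

    // Merge in ascending precedence: spark-defaults.conf first,
    // then spark-submit flags, then values set directly on SparkConf.
    public static Map<String, String> effectiveConf(Map<String, String> defaultsConf,
                                                    Map<String, String> submitFlags,
                                                    Map<String, String> sparkConf) {
        Map<String, String> merged = new LinkedHashMap<>(defaultsConf);
        merged.putAll(submitFlags); // flags override defaults.conf
        merged.putAll(sparkConf);   // SparkConf takes highest precedence
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> defaults = Map.of(
                "spark.driver.memory", "1g",
                "spark.task.maxFailures", "4");
        Map<String, String> flags = Map.of("spark.driver.memory", "3g");
        Map<String, String> code = Map.of("spark.driver.memory", "2g");

        // The SparkConf value wins for the contested key;
        // uncontested keys fall through from defaults.conf.
        System.out.println(effectiveConf(defaults, flags, code));
    }
}
```

The same ordering is what the quoted documentation describes: "Properties set directly on the SparkConf take highest precedence, then flags passed to spark-submit or spark-shell, then options in the spark-defaults.conf file."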

So my final recommendation would be to feel free to set settings such as "spark.driver.memory" or "spark.executor.instances" in code.

Regarding java - Spark deploy-related properties in spark-submit, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/49104868/
