I am running Spark on YARN. I don't understand the difference between the following two settings: spark.yarn.executor.memoryOverhead and spark.memory.offHeap.size. Both appear to be settings for allocating off-heap memory to the Spark executor. Which one should I use? Also, what is the recommended setting for executor off-heap memory?
Many thanks!
Best Answer
spark.executor.memoryOverhead is used by resource management systems such as YARN, whereas spark.memory.offHeap.size is used by Spark core (the memory manager). The relationship between them differs by version.

Spark 2.4.5 and earlier: spark.executor.memoryOverhead should include spark.memory.offHeap.size. This means that if you specify offHeap.size, you need to add that amount to memoryOverhead for YARN manually. As you can see from the following code in YarnAllocator.scala, YARN knows nothing about offHeap.size when it requests resources:
private[yarn] val resource = Resource.newInstance(
  executorMemory + memoryOverhead + pysparkWorkerMemory,
  executorCores)
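As a sketch of the arithmetic (all sizes hypothetical, in MiB), on Spark 2.4.5 and earlier you must fold the off-heap size into the overhead yourself before YARN ever sees the request:

```shell
# Hypothetical sizes in MiB. On Spark <= 2.4.5 the YARN container request is
# executorMemory + memoryOverhead (+ PySpark memory); offHeap.size is ignored
# by YARN, so it must be folded into memoryOverhead by hand.
executor_memory=4096          # spark.executor.memory
base_overhead=1024            # the overhead you would otherwise configure
offheap_size=2048             # spark.memory.offHeap.size

# The memoryOverhead value you actually have to set:
memory_overhead=$((base_overhead + offheap_size))
# Total container size YARN will request:
container=$((executor_memory + memory_overhead))
echo "$memory_overhead $container"   # 3072 7168
```

Here you would submit with spark.yarn.executor.memoryOverhead=3072m, not 1024m, or the off-heap allocation would eat into the container's headroom.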
Since Spark 3.0: spark.executor.memoryOverhead no longer includes spark.memory.offHeap.size. YARN will include offHeap.size for you when requesting resources. From the new documentation:
Note: Additional memory includes PySpark executor memory (when spark.executor.pyspark.memory is not configured) and memory used by other non-executor processes running in the same container. The maximum memory size of container to running executor is determined by the sum of spark.executor.memoryOverhead, spark.executor.memory, spark.memory.offHeap.size and spark.executor.pyspark.memory.
private[yarn] val resource: Resource = {
  val resource = Resource.newInstance(
    executorMemory + executorOffHeapMemory + memoryOverhead + pysparkWorkerMemory,
    executorCores)
  ResourceRequestHelper.setResourceRequests(executorResourceRequests, resource)
  logDebug(s"Created resource capability: $resource")
  resource
}
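With the same hypothetical sizes, the Spark 3.0+ arithmetic keeps the two settings separate; Spark adds the off-heap portion into the container request for you:

```shell
# Hypothetical sizes in MiB. On Spark 3.0+ YARN requests
# executorMemory + executorOffHeapMemory + memoryOverhead (+ PySpark memory)
# itself, so memoryOverhead keeps its own value and offHeap.size is no
# longer double-counted into it.
executor_memory=4096          # spark.executor.memory
memory_overhead=1024          # spark.executor.memoryOverhead (unchanged)
offheap_size=2048             # spark.memory.offHeap.size

container=$((executor_memory + offheap_size + memory_overhead))
echo "$container"   # 7168
```

The total container size is the same 7168 MiB as before; the difference is only in which configuration key carries the off-heap portion.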
Off-heap memory is a great way to reduce GC pauses because it is outside the GC's scope. However, it brings an overhead of serialization and deserialization, and deserialization in turn means that off-heap data can sometimes be put back onto heap memory and thus exposed to GC. Also, the new data format brought by Project Tungsten (arrays of bytes) helps reduce the GC overhead. For these two reasons, the use of off-heap memory in Apache Spark applications should be carefully planned and, especially, tested.
spark.yarn.executor.memoryOverhead is deprecated and has been renamed to spark.executor.memoryOverhead, which is shared by YARN and Kubernetes.
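A hedged sketch of the rename in a submission command (the 1g value and `my_app.jar` are placeholders, not recommendations):

```shell
# Deprecated, YARN-specific spelling:
#   --conf spark.yarn.executor.memoryOverhead=1g
# Current spelling, shared by YARN and Kubernetes:
spark-submit \
  --master yarn \
  --conf spark.executor.memoryOverhead=1g \
  my_app.jar
```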
Regarding apache-spark - the difference between "spark.yarn.executor.memoryOverhead" and "spark.memory.offHeap.size", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/58666517/