
apache-spark - What is Storage Memory in the Executors tab of the web UI, and how do I control it?

Reposted. Author: 行者123. Updated: 2023-12-04 22:28:50

I am using Spark 1.5.2 for a Spark Streaming application.

What is Storage Memory in the Executors tab of the web UI? How was the value of 530 MB arrived at? How can I change it?


Best answer

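Caution: you are using Spark 1.5.2, which is very, very old and currently unsupported (I noticed this only after posting the answer), while my answer is about Spark 1.6+.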

The tooltip for Storage Memory may say it all:

Memory used / total available memory for storage of data like RDD partitions cached in memory.


Storage Memory in Executors tab in web UI
It is part of the unified memory management feature introduced in SPARK-10000: Consolidate storage and execution memory management, which says (quoting verbatim):

Memory management in Spark is currently broken down into two disjoint regions: one for execution and one for storage. The sizes of these regions are statically configured and fixed for the duration of the application.

There are several limitations to this approach. It requires user expertise to avoid unnecessary spilling, and there are no sensible defaults that will work for all workloads. As a Spark user, I want Spark to manage the memory more intelligently so I do not need to worry about how to statically partition the execution (shuffle) memory fraction and cache memory fraction. More importantly, applications that do not use caching use only a small fraction of the heap space, resulting in suboptimal performance.

Instead, we should unify these two regions and let one borrow from another if possible.
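
The static split described in the quote is the model that applies to Spark 1.5.2, so it can help to see the arithmetic. Below is a minimal sketch, not Spark's actual code, assuming the legacy defaults spark.storage.memoryFraction=0.6 and spark.storage.safetyFraction=0.9. The JVM maximum of about 982 MB for a default 1g executor heap is an assumption for illustration; the real figure depends on the JVM and its garbage collector settings.

```python
# Sketch of the legacy (pre-1.6) static storage region size.
# The two fractions are the legacy Spark defaults; the heap figure used in
# the example call is an assumed value, not something Spark guarantees.
STORAGE_MEMORY_FRACTION = 0.6  # spark.storage.memoryFraction
STORAGE_SAFETY_FRACTION = 0.9  # spark.storage.safetyFraction


def legacy_storage_memory_mb(jvm_max_memory_mb,
                             memory_fraction=STORAGE_MEMORY_FRACTION,
                             safety_fraction=STORAGE_SAFETY_FRACTION):
    """Return the size in MB of the fixed storage region."""
    return jvm_max_memory_mb * memory_fraction * safety_fraction


# Assumed: the JVM reports roughly 982 MB for a 1g executor heap.
print(round(legacy_storage_memory_mb(982.0), 1))
```

Under these assumptions the result lands at roughly 530 MB, which would be consistent with the value in the question. Raising spark.executor.memory (or the legacy spark.storage.memoryFraction) would scale it accordingly.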


Spark properties
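You can control the storage memory using the spark.driver.memory and spark.executor.memory Spark properties, which set the entire memory space of a Spark application (the driver and the executors), with the split between regions controlled by spark.memory.fraction and spark.memory.storageFraction.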
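
To make the split concrete, here is a rough sketch of how the unified regions could be derived from those properties in Spark 1.6+. It assumes the 300 MB of reserved memory and the defaults spark.memory.fraction=0.6 (0.75 in Spark 1.6) and spark.memory.storageFraction=0.5. The function name and the use of the nominal heap size are illustrative assumptions; the figure shown in the web UI may differ, because the JVM reports less than the configured heap and storage can borrow from execution.

```python
# Sketch of the unified memory split (Spark 1.6+), not Spark's own code.
RESERVED_SYSTEM_MEMORY_MB = 300  # memory Spark sets aside before the split


def unified_memory_regions_mb(heap_mb, memory_fraction=0.6,
                              storage_fraction=0.5):
    """Return approximate region sizes in MB for a given heap size.

    heap_mb          -- heap size, e.g. from spark.executor.memory
    memory_fraction  -- spark.memory.fraction
    storage_fraction -- spark.memory.storageFraction
    """
    usable = heap_mb - RESERVED_SYSTEM_MEMORY_MB
    if usable <= 0:
        raise ValueError("heap must be larger than the reserved memory")
    unified = usable * memory_fraction
    # Storage region that is immune to eviction by execution.
    storage = unified * storage_fraction
    return {
        "unified": unified,
        "storage": storage,
        "execution": unified - storage,
    }


regions = unified_memory_regions_mb(1024)
print({name: round(size, 1) for name, size in regions.items()})
```

The properties themselves are set in the usual ways, for example with --conf spark.executor.memory=2g on spark-submit, and the same function can then be used to estimate the effect of a change.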

You should consider going through the slides Memory Management in Apache Spark by Andrew Or, the author of the feature, and watching his video Deep Dive: Apache Spark Memory Management.

You may also want to read how the Storage Memory values are calculated (in the web UI and internally) in How does web UI calculate Storage Memory (in Executors tab)?

Regarding "apache-spark - What is Storage Memory in the Executors tab of the web UI, and how do I control it?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/41343456/
