
apache-spark - How can Spark run out of memory if it supports spilling memory to disk?


I've read some documentation about Spark memory management.

On this page, What will spark do if I don't have enough memory?, it says:

Spark stores partitions in LRU cache in memory. When cache hits its limit in size, it evicts the entry (i.e. partition) from it. When the partition has “disk” attribute (i.e. your persistence level allows storing partition on disk), it would be written to HDD and the memory consumed by it would be freed, unless you would request it. When you request it, it would be read into the memory, and if there won’t be enough memory some other, older entries from the cache would be evicted. If your partition does not have “disk” attribute, eviction would simply mean destroying the cache entry without writing it to HDD.
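
To make the "disk" attribute concrete, here is a minimal Scala sketch (the sample data and the local master setting are illustrative) contrasting a persistence level that lets evicted partitions spill to disk with one that simply drops them:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

object PersistenceLevelsSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("persistence-levels-sketch")
      .master("local[*]")
      .getOrCreate()

    val rdd = spark.sparkContext.parallelize(1 to 1000000)

    // MEMORY_ONLY: no "disk" attribute. Evicted partitions are simply
    // destroyed and must be recomputed from lineage when requested again.
    val memOnly = rdd.map(_ * 2).persist(StorageLevel.MEMORY_ONLY)

    // MEMORY_AND_DISK: partitions carry the "disk" attribute, so when the
    // LRU cache evicts them they are written to local disk instead of lost,
    // and read back into memory when requested.
    val memAndDisk = rdd.map(_ * 3).persist(StorageLevel.MEMORY_AND_DISK)

    println(memOnly.count())
    println(memAndDisk.count())

    spark.stop()
  }
}
```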

So if partitions spill to disk when memory runs short, how can Spark still hit an out-of-memory error at runtime?

Best Answer

Spark can only evict cached RDD blocks, that is, blocks of RDDs the application has marked to be stored in memory. So the storage portion of memory can be cleared, but not the execution portion. Spark Memory Management states:

Execution memory refers to that used for computation in shuffles, joins, sorts and aggregations.

As for whether they can be evicted:

Storage may not evict execution due to complexities in implementation.

If the amount of memory available to the JVM is less than the execution memory required, an OOM is bound to occur.
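
As an illustration of that boundary, here is a hedged Scala sketch using Spark's real spark.memory.fraction and spark.memory.storageFraction settings (the sizes and the sample job are illustrative, not recommendations). Execution memory can evict cached storage blocks to grow, but if its own working set still exceeds the unified pool, the task fails with an OutOfMemoryError:

```scala
import org.apache.spark.sql.SparkSession

object UnifiedMemorySketch {
  def main(args: Array[String]): Unit = {
    // With 4g of executor heap and the defaults below, roughly
    // (heap - 300MB reserved) * 0.6 is the unified pool shared by
    // execution and storage. spark.memory.storageFraction only marks the
    // share of that pool where cached blocks are immune to eviction;
    // it does not cap execution memory.
    // (executor.memory applies per executor on a cluster; in local mode
    // the driver JVM's heap is what matters -- illustrative here.)
    val spark = SparkSession.builder()
      .appName("unified-memory-sketch")
      .master("local[*]")
      .config("spark.executor.memory", "4g")
      .config("spark.memory.fraction", "0.6")        // default
      .config("spark.memory.storageFraction", "0.5") // default
      .getOrCreate()

    // The shuffle below builds aggregation buffers in execution memory.
    // Execution may evict cached (storage) blocks to make room, but storage
    // can never evict execution; if execution's requirement exceeds what
    // the JVM can provide, the task throws an OutOfMemoryError.
    val counts = spark.sparkContext
      .parallelize(1 to 1000000)
      .map(i => (i % 1000, 1))
      .reduceByKey(_ + _)

    println(counts.count())
    spark.stop()
  }
}
```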

Regarding apache-spark - How can Spark run out of memory if it supports spilling memory to disk?, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/55605506/
