apache-spark - java.lang.OutOfMemoryError on a simple rdd.count() operation


I'm having a lot of trouble with a simple count over about 55 files on HDFS, roughly 1B records in total. Both spark-shell and PySpark fail with OOM errors. I'm using YARN, MapR, Spark 1.3.1, and HDFS 2.4.1. (It fails in local mode as well.) I've tried to follow the usual tuning and configuration advice, throwing more and more memory at the executors. My configuration is:

from pyspark import SparkConf, SparkContext

conf = (SparkConf()
        .setMaster("yarn-client")
        .setAppName("pyspark-testing")
        .set("spark.executor.memory", "6g")
        .set("spark.driver.memory", "6g")
        .set("spark.executor.instances", 20)
        .set("spark.yarn.executor.memoryOverhead", "1024")
        .set("spark.yarn.driver.memoryOverhead", "1024")
        .set("spark.yarn.am.memoryOverhead", "1024")
        )
sc = SparkContext(conf=conf)
sc.textFile('/data/on/hdfs/*.csv').count()  # fails every time
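For what it's worth, these settings are not obviously undersized: on YARN each executor container requests spark.executor.memory plus spark.yarn.executor.memoryOverhead, i.e. roughly 6g + 1024m ≈ 7g per container, and 20 such executors come to about 140g in total for ~1B records spread over 55 files.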

The job gets split into 893 tasks, and after roughly 50 of them complete successfully, many begin to fail. I see ExecutorLostFailure in the application's stderr. When I dig into the executor logs, I see errors like the following:

15/06/24 16:54:07 ERROR util.Utils: Uncaught exception in thread stdout writer for /work/analytics2/analytics/python/envs/santon/bin/python
java.lang.OutOfMemoryError: Java heap space
    at java.nio.HeapCharBuffer.<init>(HeapCharBuffer.java:57)
    at java.nio.CharBuffer.allocate(CharBuffer.java:331)
    at java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:792)
    at org.apache.hadoop.io.Text.decode(Text.java:406)
    at org.apache.hadoop.io.Text.decode(Text.java:383)
    at org.apache.hadoop.io.Text.toString(Text.java:281)
    at org.apache.spark.SparkContext$$anonfun$textFile$1.apply(SparkContext.scala:558)
    at org.apache.spark.SparkContext$$anonfun$textFile$1.apply(SparkContext.scala:558)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at org.apache.spark.api.python.PythonRDD$.writeIteratorToStream(PythonRDD.scala:379)
    at org.apache.spark.api.python.PythonRDD$WriterThread$$anonfun$run$1.apply$mcV$sp(PythonRDD.scala:242)
    at org.apache.spark.api.python.PythonRDD$WriterThread$$anonfun$run$1.apply(PythonRDD.scala:204)
    at org.apache.spark.api.python.PythonRDD$WriterThread$$anonfun$run$1.apply(PythonRDD.scala:204)
    at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1550)
    at org.apache.spark.api.python.PythonRDD$WriterThread.run(PythonRDD.scala:203)
15/06/24 16:54:07 ERROR util.SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[stdout writer for /work/analytics2/analytics/python/envs/santon/bin/python,5,main]
java.lang.OutOfMemoryError: Java heap space
    at java.nio.HeapCharBuffer.<init>(HeapCharBuffer.java:57)
    at java.nio.CharBuffer.allocate(CharBuffer.java:331)
    at java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:792)
    at org.apache.hadoop.io.Text.decode(Text.java:406)
    at org.apache.hadoop.io.Text.decode(Text.java:383)
    at org.apache.hadoop.io.Text.toString(Text.java:281)
    at org.apache.spark.SparkContext$$anonfun$textFile$1.apply(SparkContext.scala:558)
    at org.apache.spark.SparkContext$$anonfun$textFile$1.apply(SparkContext.scala:558)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at org.apache.spark.api.python.PythonRDD$.writeIteratorToStream(PythonRDD.scala:379)
    at org.apache.spark.api.python.PythonRDD$WriterThread$$anonfun$run$1.apply$mcV$sp(PythonRDD.scala:242)
    at org.apache.spark.api.python.PythonRDD$WriterThread$$anonfun$run$1.apply(PythonRDD.scala:204)
    at org.apache.spark.api.python.PythonRDD$WriterThread$$anonfun$run$1.apply(PythonRDD.scala:204)
    at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1550)
    at org.apache.spark.api.python.PythonRDD$WriterThread.run(PythonRDD.scala:203)
15/06/24 16:54:07 ERROR executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL 15: SIGTERM

And in stdout (the kill triggered here by -XX:OnOutOfMemoryError is what delivered the SIGTERM logged above):

# java.lang.OutOfMemoryError: Java heap space
# -XX:OnOutOfMemoryError="kill %p"
# Executing /bin/sh -c "kill 16490"...

Overall, I think I understand OOM errors and how to troubleshoot them, but I'm conceptually stuck here. This is just a simple count. I don't understand how the Java heap could possibly fill up when the executors have ~3G of heap. Has anyone run into this before, or have any pointers? Is there something going on under the hood that would shed light on the issue?
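One failure mode that produces exactly this trace on a bare count() is a single enormous "line": textFile decodes every record into a full Java String before handing it to the Python worker, so one record approaching the split size can exhaust the heap on its own, no matter how much memory the tuning advice adds. A minimal local sketch of that mechanism (hypothetical file path and a deliberately modest size; run it in its own local session, not against the cluster):

from pyspark import SparkContext

sc = SparkContext("local[1]", "one-huge-line")  # separate local session

path = "/tmp/no_newlines.txt"  # hypothetical scratch file
with open(path, "w") as f:
    f.write("x" * (64 * 1024 * 1024))  # 64 MB with not a single '\n'

# TextInputFormat treats all bytes up to the next '\n' as one record, so this
# file is exactly one "line"; it must be decoded into a single Java String
# (the Text.toString call in the trace) before count() ever sees it.
print(sc.textFile(path).count())  # prints 1; scale the size up toward the
                                  # heap limit and this decode is what OOMs

Whether a record like that comes from corruption or something else, the mechanism is the same, and the answer below is consistent with it.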

Update:

I also noticed that when I specify a parallelism (e.g. sc.textFile(..., 1000)) higher than the default 893 tasks, the job that gets created has 920 tasks, and all but the last complete without error; the last task then hangs indefinitely. This seems very strange!

Best Answer

It turns out that the problem I was having was actually due to a single corrupted file. Running a simple cat or wc -l on that file caused the terminal to hang. In hindsight this fits the stack trace above: if the corruption leaves a long run of bytes without newline delimiters, textFile has to materialize that entire span as a single line, and the Text.toString / CharBuffer.allocate calls will exhaust the heap no matter how much executor memory is configured. It would also explain why, with smaller splits, only the single task covering the bad region hung.
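Since the failure is data-dependent, one way to pin down a file like this is to count each input file separately, so that whichever per-file job fails or hangs points at the culprit. A minimal sketch, reusing the sc from the question; listing files via the hadoop fs -ls CLI is an assumption, not part of the original post:

import subprocess

# Hypothetical diagnostic: count each file on its own so the job that fails or
# hangs identifies the corrupt file. Substitute whatever file enumeration
# fits your environment for the `hadoop fs -ls` call.
out = subprocess.check_output(["hadoop", "fs", "-ls", "/data/on/hdfs"]).decode("utf-8")
paths = [line.split()[-1] for line in out.splitlines() if line.strip().endswith(".csv")]

for path in paths:
    try:
        print("%s: %d lines" % (path, sc.textFile(path).count()))
    except Exception as exc:  # repeated task failures surface here as a job failure
        print("%s: FAILED (%s)" % (path, exc))

This is slower than one glob over all 55 files, but it turns an opaque cluster-wide OOM into a pointer at a specific file, which can then be inspected directly (here, even cat or wc -l hung on it).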

Regarding apache-spark - java.lang.OutOfMemoryError on a simple rdd.count() operation, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/31058860/
