
apache-spark - FileNotFoundException in an Apache Spark (1.6) job while writing shuffle files

Reposted. Author: 行者123. Updated: 2023-12-04 05:25:00

I am working with Spark 1.6, and my job is failing with the following error:

java.io.FileNotFoundException: /data/05/dfs/dn/yarn/nm/usercache/willir31/appcache/application_1413512480649_0108/spark-local-20141028214722-43f1/26/shuffle_0_312_0.index (No such file or directory)
    java.io.FileOutputStream.open(Native Method)
    java.io.FileOutputStream.<init>(FileOutputStream.java:221)
    org.apache.spark.storage.DiskBlockObjectWriter.open(BlockObjectWriter.scala:123)
    org.apache.spark.storage.DiskBlockObjectWriter.write(BlockObjectWriter.scala:192)
    org.apache.spark.util.collection.ExternalSorter$$anonfun$writePartitionedFile$4$$anonfun$apply$2.apply(ExternalSorter.scala:733)
    org.apache.spark.util.collection.ExternalSorter$$anonfun$writePartitionedFile$4$$anonfun$apply$2.apply(ExternalSorter.scala:732)
    scala.collection.Iterator$class.foreach(Iterator.scala:727)
    org.apache.spark.util.collection.ExternalSorter$IteratorForPartition.foreach(ExternalSorter.scala:790)
    org.apache.spark.util.collection.ExternalSorter$$anonfun$writePartitionedFile$4.apply(ExternalSorter.scala:732)
    org.apache.spark.util.collection.ExternalSorter$$anonfun$writePartitionedFile$4.apply(ExternalSorter.scala:728)
    scala.collection.Iterator$class.foreach(Iterator.scala:727)
    scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
    org.apache.spark.util.collection.ExternalSorter.writePartitionedFile(ExternalSorter.scala:728)
    org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:70)
    org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
    org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)



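I am performing a join. When I looked closely at the error and checked my code, I found that the job fails while writing the DataFrame back to CSV, but I have not been able to get rid of the error. I am not using HDP; I have a separate installation of each component.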

Best answer

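This type of error usually occurs when some of the tasks have a deeper problem, such as significant data skew. Since you have not provided enough details (please be sure to read How To Ask and How to create a Minimal, Complete, and Verifiable example) or any job statistics, the only thing I can suggest is to significantly increase the number of shuffle partitions: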

sqlContext.setConf("spark.sql.shuffle.partitions", "2048")
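To make the reasoning behind that suggestion concrete, here is a rough, Spark-free sketch of how records spread across hash partitions when one key is much more frequent than the rest. It is only a toy model: the function names and the synthetic key distribution are made up for illustration, and `hash(key) % num_partitions` merely approximates hash partitioning rather than reproducing Spark's actual partitioner.

```python
from collections import Counter


def partition_sizes(keys, num_partitions):
    """Count records per partition under simple hash partitioning.

    Toy stand-in for a shuffle: each record goes to hash(key) % num_partitions.
    Integer keys are used so that Python's hash() is deterministic.
    """
    counts = Counter(hash(k) % num_partitions for k in keys)
    return [counts.get(p, 0) for p in range(num_partitions)]


def build_skewed_keys(hot_records=1000, normal_keys=1000, records_per_key=10):
    """Hypothetical data set: one hot key (0) plus many small keys (1..normal_keys)."""
    keys = [0] * hot_records
    for k in range(1, normal_keys + 1):
        keys.extend([k] * records_per_key)
    return keys


if __name__ == "__main__":
    keys = build_skewed_keys()
    for n in (8, 2048):
        sizes = partition_sizes(keys, n)
        print(f"{n} partitions: largest partition holds {max(sizes)} records")
    # 8 partitions:    largest partition holds 2250 records
    # 2048 partitions: largest partition holds 1000 records
```

In this model, going from 8 to 2048 partitions shrinks the largest partition considerably, which is why raising spark.sql.shuffle.partitions can relieve pressure on individual tasks. The largest partition never drops below the size of the hot key, though, so a single heavily skewed join key would need a different remedy than more partitions.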

Regarding apache-spark - FileNotFoundException in an Apache Spark (1.6) job while writing shuffle files, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/38367804/
