apache-spark - Spark throws java.io.IOException: Failed to rename when saving part-xxxxx.gz

Reposted by: 行者123 Last updated: 2023-12-02 11:24:52

New Spark user here. I am extracting features from many .tif images stored on AWS S3, each with an identifier like 02_R4_C7. I am using Spark 2.2.1 and Hadoop 2.7.2.

I am using all default configurations, as shown below:

from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext

conf = SparkConf().setAppName("Feature Extraction")
sc = SparkContext(conf=conf)
sc.setLogLevel("ERROR")
sqlContext = SQLContext(sc)

Here is the function call that fails, after some features have already been saved successfully as part-xxxx.gz files in the image-id folder:
features_labels_rdd.saveAsTextFile(text_rdd_direct,"org.apache.hadoop.io.compress.GzipCodec")
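See the error below. When I delete the successfully created part-xxxxx.gz feature files and rerun the script, it fails on a different image and a different part-xxxxx.gz, in a seemingly nondeterministic way. I make sure to delete all features before rerunning. My theory is that two workers are trying to create the same temporary file and are conflicting with each other, because there are two identical error messages for the same file, one second apart.

I'm at a loss as to what to do. I have seen the list of Spark configurations that can change how Spark handles tasks, but I'm not sure which would help here, since I don't understand the problem I'm having. Any help is greatly appreciated!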

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/spark/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
18/06/26 19:24:40 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/06/26 19:24:41 WARN spark.SparkConf: In Spark 1.0 and later spark.local.dir will be overridden by the value set by the cluster manager (via SPARK_LOCAL_DIRS in mesos/standalone and LOCAL_DIRS in YARN).
n images = 512
 Feature file of 02_R4_C7 is created                                            
[Stage 3:=================>                                       (6 + 14) / 20]18/06/26 19:24:58 ERROR mapred.SparkHadoopMapRedUtil: Error committing the output of task: attempt_20180626192453_0003_m_000007_59
java.io.IOException: Failed to rename FileStatus{path=s3n://activemapper/imagery/southafrica/wv2/RDD48FeaturesTextFile/02_R4_C6/_temporary/0/_temporary/attempt_20180626192453_0003_m_000007_59/part-00007.gz; isDirectory=false; length=952309; replication=1; blocksize=67108864; modification_time=1530041098000; access_time=0; owner=; group=; permission=rw-rw-rw-; isSymlink=false} to s3n://activemapper/imagery/southafrica/wv2/RDD48FeaturesTextFile/02_R4_C6/part-00007.gz
    at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.mergePaths(FileOutputCommitter.java:415)
    at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.mergePaths(FileOutputCommitter.java:428)
    at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.commitTask(FileOutputCommitter.java:539)
    at org.apache.hadoop.mapred.FileOutputCommitter.commitTask(FileOutputCommitter.java:172)
    at org.apache.hadoop.mapred.OutputCommitter.commitTask(OutputCommitter.java:343)
    at org.apache.spark.mapred.SparkHadoopMapRedUtil$.performCommit$1(SparkHadoopMapRedUtil.scala:50)
    at org.apache.spark.mapred.SparkHadoopMapRedUtil$.commitTask(SparkHadoopMapRedUtil.scala:76)
    at org.apache.spark.internal.io.SparkHadoopWriter.commit(SparkHadoopWriter.scala:105)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$12.apply(PairRDDFunctions.scala:1146)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$12.apply(PairRDDFunctions.scala:1125)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
[Stage 3:=====================================>                   (13 + 7) / 20]18/06/26 19:24:58 ERROR executor.Executor: Exception in task 7.0 in stage 3.0 (TID 59)
java.io.IOException: Failed to rename FileStatus{path=s3n://activemapper/imagery/southafrica/wv2/RDD48FeaturesTextFile/02_R4_C6/_temporary/0/_temporary/attempt_20180626192453_0003_m_000007_59/part-00007.gz; isDirectory=false; length=952309; replication=1; blocksize=67108864; modification_time=1530041098000; access_time=0; owner=; group=; permission=rw-rw-rw-; isSymlink=false} to s3n://activemapper/imagery/southafrica/wv2/RDD48FeaturesTextFile/02_R4_C6/part-00007.gz
    at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.mergePaths(FileOutputCommitter.java:415)
    at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.mergePaths(FileOutputCommitter.java:428)
    at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.commitTask(FileOutputCommitter.java:539)
    at org.apache.hadoop.mapred.FileOutputCommitter.commitTask(FileOutputCommitter.java:172)
    at org.apache.hadoop.mapred.OutputCommitter.commitTask(OutputCommitter.java:343)
    at org.apache.spark.mapred.SparkHadoopMapRedUtil$.performCommit$1(SparkHadoopMapRedUtil.scala:50)
    at org.apache.spark.mapred.SparkHadoopMapRedUtil$.commitTask(SparkHadoopMapRedUtil.scala:76)
    at org.apache.spark.internal.io.SparkHadoopWriter.commit(SparkHadoopWriter.scala:105)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$12.apply(PairRDDFunctions.scala:1146)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$12.apply(PairRDDFunctions.scala:1125)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
18/06/26 19:24:58 ERROR scheduler.TaskSetManager: Task 7 in stage 3.0 failed 1 times; aborting job
Traceback (most recent call last):
  File "run_feature_extraction_spark.py", line 88, in <module>
    main(sc)
  File "run_feature_extraction_spark.py", line 75, in main
    features_labels_rdd.saveAsTextFile(text_rdd_direct, "org.apache.hadoop.io.compress.GzipCodec")
  File "/home/ubuntu/.local/lib/python2.7/site-packages/pyspark/rdd.py", line 1551, in saveAsTextFile
    keyed._jrdd.map(self.ctx._jvm.BytesToString()).saveAsTextFile(path, compressionCodec)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/py4j/java_gateway.py", line 1133, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/pyspark/sql/utils.py", line 63, in deco
    return f(*a, **kw)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/py4j/protocol.py", line 319, in get_return_value
    format(target_id, ".", name), value)
py4j.protocol.Py4JJavaError: An error occurred while calling o76.saveAsTextFile.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 7 in stage 3.0 failed 1 times, most recent failure: Lost task 7.0 in stage 3.0 (TID 59, localhost, executor driver): java.io.IOException: Failed to rename FileStatus{path=s3n://activemapper/imagery/southafrica/wv2/RDD48FeaturesTextFile/02_R4_C6/_temporary/0/_temporary/attempt_20180626192453_0003_m_000007_59/part-00007.gz; isDirectory=false; length=952309; replication=1; blocksize=67108864; modification_time=1530041098000; access_time=0; owner=; group=; permission=rw-rw-rw-; isSymlink=false} to s3n://activemapper/imagery/southafrica/wv2/RDD48FeaturesTextFile/02_R4_C6/part-00007.gz

When I run it again, the script gets further, but fails with the same error on a different image folder and a different part-xxxx.gz file:

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/spark/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
18/06/26 19:37:24 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/06/26 19:37:24 WARN spark.SparkConf: In Spark 1.0 and later spark.local.dir will be overridden by the value set by the cluster manager (via SPARK_LOCAL_DIRS in mesos/standalone and LOCAL_DIRS in YARN).
n images = 512

 Feature file of 02_R4_C7 is created                                            
 Feature file of 02_R4_C6 is created                                            
 Feature file of 02_R4_C5 is created                                            
 Feature file of 02_R4_C4 is created                                            
 Feature file of 02_R4_C3 is created                                            
 Feature file of 02_R4_C2 is created                                            
 Feature file of 02_R4_C1 is created                                            
[Stage 15:==========================================>             (15 + 5) / 20]18/06/26 19:38:16 ERROR mapred.SparkHadoopMapRedUtil: Error committing the output of task: attempt_20180626193811_0015_m_000017_285
java.io.IOException: Failed to rename FileStatus{path=s3n://activemapper/imagery/southafrica/wv2/RDD48FeaturesTextFile/02_R4_C0/_temporary/0/_temporary/attempt_20180626193811_0015_m_000017_285/part-00017.gz; isDirectory=false; length=896020; replication=1; blocksize=67108864; modification_time=1530041897000; access_time=0; owner=; group=; permission=rw-rw-rw-; isSymlink=false} to s3n://activemapper/imagery/southafrica/wv2/RDD48FeaturesTextFile/02_R4_C0/part-00017.gz

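Best answer

It is not safe to use S3 as the direct destination of work without a "consistency layer" (Consistent EMR, or S3Guard from the Apache Hadoop project itself) or a special output committer designed specifically for working with S3 (Hadoop 3.1+, "the S3A committers"). The rename is where things fail: because listings are inconsistent, the scan for files to copy may miss data, or may find deleted files that cannot be renamed. Your stack trace looks exactly how I would expect this to surface: job commits failing, apparently at random.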
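To make the failure mode concrete, here is a toy model in plain Python. It is not Hadoop or S3 code: `LaggyStore` and `commit` are hypothetical names invented for this sketch, and the store simply serves listings from a snapshot that can lag behind the real contents. Under that assumption, a list-then-rename commit shows both symptoms described above.

```python
class LaggyStore:
    """Toy object store whose listings may lag behind the real contents."""

    def __init__(self):
        self.objects = {}     # ground truth: key -> data
        self.listing = set()  # what a LIST request currently reports

    def put(self, key, data):
        self.objects[key] = data  # not visible in listings until sync()

    def delete(self, key):
        self.objects.pop(key, None)  # still visible in listings until sync()

    def sync(self):
        self.listing = set(self.objects)

    def list(self, prefix):
        return sorted(k for k in self.listing if k.startswith(prefix))

    def rename(self, src, dst):
        if src not in self.objects:
            raise IOError("Failed to rename %s to %s" % (src, dst))
        self.objects[dst] = self.objects.pop(src)


def commit(store, temp_prefix, final_prefix):
    """Rename-based commit: list the temporary files, then rename each one."""
    moved = []
    for key in store.list(temp_prefix):
        dst = final_prefix + key[len(temp_prefix):]
        store.rename(key, dst)
        moved.append(dst)
    return moved


# Case 1: a file written after the last listing refresh is silently skipped.
store = LaggyStore()
store.put("_temporary/part-00000.gz", b"a")
store.sync()
store.put("_temporary/part-00001.gz", b"b")
print(commit(store, "_temporary/", ""))

# Case 2: a file that is already gone is still listed, so the rename fails.
store = LaggyStore()
store.put("_temporary/part-00007.gz", b"x")
store.sync()
store.delete("_temporary/part-00007.gz")
try:
    commit(store, "_temporary/", "")
except IOError as err:
    print(err)
```

Which of the two cases occurs depends only on timing, which would be consistent with the failure moving to a different image folder and part file on each run.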

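Rather than going into the details here, watch Ryan Blue's video on the topic.

Workaround: write to the cluster's local file system, then use distcp to upload to S3.

PS: on Hadoop 2.7+, switch to the s3a:// connector. Without S3Guard enabled it has exactly the same consistency problem, but it performs better.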
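The workaround could be sketched roughly as follows. This is a minimal sketch, not a tested pipeline: the helper names `build_distcp_command` and `save_then_upload` are hypothetical, the staging path in the usage comment is made up, and the code assumes the `hadoop` CLI is on the PATH of the machine running the driver. Only `saveAsTextFile` with the Gzip codec and `hadoop distcp <src> <dst>` come from the question and the answer.

```python
import subprocess


def build_distcp_command(src, dst):
    """Return the argument list for `hadoop distcp <src> <dst>`."""
    return ["hadoop", "distcp", src, dst]


def save_then_upload(rdd, staging_path, s3_path, run=subprocess.check_call):
    """Commit the RDD to a file system with real renames, then copy to S3.

    `run` is injectable so the copy step can be stubbed out in tests.
    """
    # 1. The rename-based commit happens on the staging file system
    #    (for example HDFS), where directory listings are consistent.
    rdd.saveAsTextFile(staging_path, "org.apache.hadoop.io.compress.GzipCodec")
    # 2. Copy the already committed output to S3; no rename is needed there.
    cmd = build_distcp_command(staging_path, s3_path)
    run(cmd)
    return cmd


# Hypothetical usage, with a made-up staging directory:
# save_then_upload(features_labels_rdd,
#                  "hdfs:///tmp/features/02_R4_C6",
#                  "s3a://activemapper/imagery/southafrica/wv2/RDD48FeaturesTextFile/02_R4_C6")
```

The design point is that the output commit runs against a file system whose renames and listings behave as the committer expects, and S3 only ever receives finished files.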
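For the connector switch, the output paths in the script need the s3a:// scheme in place of s3n://. A small helper such as the one below could do the rewrite; `to_s3a` is a hypothetical name for this sketch. It assumes the hadoop-aws module and a matching AWS SDK are on the classpath, and that credentials are supplied through the s3a properties (`fs.s3a.access.key`, `fs.s3a.secret.key`) or another mechanism rather than the s3n ones.

```python
def to_s3a(uri):
    """Rewrite an s3n:// URI to s3a://; any other URI is returned unchanged."""
    prefix = "s3n://"
    if uri.startswith(prefix):
        return "s3a://" + uri[len(prefix):]
    return uri


# Example with the bucket path from the stack trace:
print(to_s3a("s3n://activemapper/imagery/southafrica/wv2/RDD48FeaturesTextFile/02_R4_C6"))
```

As the answer notes, this changes performance only; the commit can still fail for the same reason unless S3Guard or an S3-aware committer is in use.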

Regarding apache-spark - Spark throws java.io.IOException: Failed to rename when saving part-xxxxx.gz, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/51050591/
