
scala - Spark Job keeps failing by running the same map 3 times


My job has a step where I convert a DataFrame to an RDD[(key, value)], but the step runs 3 times; on the third attempt it gets stuck and then fails.

The Spark UI shows:

Active Jobs (1)

Job Id (Job Group)                      Description                   Submitted            Duration  Stages: Succeeded/Total  Tasks (for all stages): Succeeded/Total
3 (zeppelin-20161017-005442_839671900)  Zeppelin map at <console>:69  2016/10/25 05:50:02  1.6 min   0/1                      210/45623

Completed Jobs (2)

2 (zeppelin-20161017-005442_839671900)  Zeppelin map at <console>:69  2016/10/25 05:16:28  23 min    1/1                      46742/46075 (21 failed)
1 (zeppelin-20161017-005442_839671900)  Zeppelin map at <console>:69  2016/10/25 04:47:58  17 min    1/1                      47369/46795 (20 failed)

Here is the code:
val eventsRDD = eventsDF.map { r =>
  val customerId = r.getAs[String]("customerId")
  val itemId     = r.getAs[String]("itemId")
  val countryId  = r.getAs[Long]("countryId").toInt
  val timeStamp  = r.getAs[String]("eventTimestamp")

  val totalRent     = r.getAs[Int]("totalRent")
  val totalPurchase = r.getAs[Int]("totalPurchase")
  val totalProfit   = r.getAs[Int]("totalProfit")

  val store    = r.getAs[String]("store")
  val itemName = r.getAs[String]("itemName")

  // Guard against a null itemName before calling any method on it,
  // and bind the result to a new name so it does not shadow itemName.
  val safeItemName = if (itemName != null && itemName.nonEmpty) itemName else "NA"

  (itemId, (customerId, countryId, timeStamp, totalRent, totalProfit, totalPurchase, store, safeItemName))
}

Can someone tell me what is going wrong here? And if I want to persist/cache, which one should I use?

Error:
16/10/25 23:28:55 INFO YarnClientSchedulerBackend: Asked to remove non-existent executor 181
16/10/25 23:28:55 WARN YarnSchedulerBackend$YarnSchedulerEndpoint: Container marked as failed: container_1477415847345_0005_02_031011 on host: ip-172-31-14-104.ec2.internal. Exit status: 52. Diagnostics: Exception from container-launch.
Container id: container_1477415847345_0005_02_031011
Exit code: 52
Stack trace: ExitCodeException exitCode=52:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
at org.apache.hadoop.util.Shell.run(Shell.java:456)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

Best Answer

An error thrown inside your map operation propagates back to the driver and causes the task to fail.
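If you want to pin down which rows are throwing, one option is to wrap the per-row extraction in scala.util.Try and keep only the rows that parse. This is a minimal sketch, not part of the original answer, and it assumes the schema from the question:

import scala.util.{Try, Success}

// Hypothetical debugging sketch: parse each row inside Try so one bad row
// (for example a null itemName) cannot fail the whole task, then keep only
// the rows that parsed successfully.
val parsedRows = eventsDF.rdd.map { r =>
  Try {
    val itemId   = r.getAs[String]("itemId")
    val itemName = Option(r.getAs[String]("itemName")).filter(_.nonEmpty).getOrElse("NA")
    (itemId, itemName)
  }
}
val goodRows = parsedRows.collect { case Success(kv) => kv } // RDD of clean pairs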

By default, spark.task.maxFailures has the value 4 and is documented as:

Number of failures of any particular task before giving up on the job. The total number of failures spread across different tasks will not cause the job to fail; a particular task has to fail this number of attempts. Should be greater than or equal to 1. Number of allowed retries = this value - 1.



So when your task fails, Spark keeps recomputing the map operation until that task has failed 4 times in total, at which point the job is aborted.
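If you need more attempts while investigating, the limit can be set when the context is created. A minimal sketch (the app name is made up, and in Zeppelin a context already exists; raising the limit only buys retries, it does not fix the underlying error):

import org.apache.spark.{SparkConf, SparkContext}

// spark.task.maxFailures: number of failures of any one task before the
// whole job is aborted (default 4).
val conf = new SparkConf()
  .setAppName("events-job") // hypothetical app name
  .set("spark.task.maxFailures", "8")
val sc = new SparkContext(conf)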

If I want to persist/cache, which one should I use?

cache is just a specific persist operation, in which the RDD is persisted with the default storage level (MEMORY_ONLY).
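For illustration, a minimal sketch of the two calls on the RDD from the question (the explicit storage level shown is just one possible choice):

import org.apache.spark.storage.StorageLevel

// The two calls below are alternatives: an RDD's storage level can only be
// assigned once, so pick one of them.
eventsRDD.cache()                                  // same as persist(StorageLevel.MEMORY_ONLY)
// eventsRDD.persist(StorageLevel.MEMORY_AND_DISK) // explicit level, spills to disk

eventsRDD.count()     // the first action materializes the cache
eventsRDD.unpersist() // free the cached partitions when done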

Regarding "scala - Spark Job keeps failing by running the same map 3 times", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/40232408/
