scala - Apache spark消息理解-6ren

scala - Apache spark消息理解

转载作者：行者123 更新时间：2023-12-04 17:53:53

24

4

请求帮助以理解此消息..

INFO spark.MapOutputTrackerMaster: Size of output statuses for shuffle 2 is **2202921** bytes

这里的 2202921 是什么意思？

我的工作做了一个洗牌操作，在从前一阶段读取洗牌文件时，它首先给出消息，然后在一段时间后它失败并出现以下错误:

14/11/12 11:09:46 WARN scheduler.TaskSetManager: Lost task 224.0 in stage 4.0 (TID 13938, ip-xx-xxx-xxx-xx.ec2.internal): FetchFailed(BlockManagerId(11, ip-xx-xxx-xxx-xx.ec2.internal, 48073, 0), shuffleId=2, mapId=7468, reduceId=224)
14/11/12 11:09:46 INFO scheduler.DAGScheduler: Marking Stage 4 (coalesce at <console>:49) as failed due to a fetch failure from Stage 3 (map at <console>:42)
14/11/12 11:09:46 INFO scheduler.DAGScheduler: Stage 4 (coalesce at <console>:49) failed in 213.446 s
14/11/12 11:09:46 INFO scheduler.DAGScheduler: Resubmitting Stage 3 (map at <console>:42) and Stage 4 (coalesce at <console>:49) due to fetch failure
14/11/12 11:09:46 INFO scheduler.DAGScheduler: Executor lost: 11 (epoch 2)
14/11/12 11:09:46 INFO storage.BlockManagerMasterActor: Trying to remove executor 11 from BlockManagerMaster.
14/11/12 11:09:46 INFO storage.BlockManagerMaster: Removed 11 successfully in removeExecutor
14/11/12 11:09:46 INFO scheduler.Stage: Stage 3 is now unavailable on executor 11 (11893/12836, false)
14/11/12 11:09:46 INFO scheduler.DAGScheduler: Resubmitting failed stages
14/11/12 11:09:46 INFO scheduler.DAGScheduler: Submitting Stage 3 (MappedRDD[13] at map at <console>:42), which has no missing parents
14/11/12 11:09:46 INFO storage.MemoryStore: ensureFreeSpace(25472) called with curMem=474762, maxMem=11113699737
14/11/12 11:09:46 INFO storage.MemoryStore: Block broadcast_6 stored as values in memory (estimated size 24.9 KB, free 10.3 GB)
14/11/12 11:09:46 INFO storage.MemoryStore: ensureFreeSpace(5160) called with curMem=500234, maxMem=11113699737
14/11/12 11:09:46 INFO storage.MemoryStore: Block broadcast_6_piece0 stored as bytes in memory (estimated size 5.0 KB, free 10.3 GB)
14/11/12 11:09:46 INFO storage.BlockManagerInfo: Added broadcast_6_piece0 in memory on ip-xx.ec2.internal:35571 (size: 5.0 KB, free: 10.4 GB)
14/11/12 11:09:46 INFO storage.BlockManagerMaster: Updated info of block broadcast_6_piece0
14/11/12 11:09:46 INFO scheduler.DAGScheduler: Submitting 943 missing tasks from Stage 3 (MappedRDD[13] at map at <console>:42)
14/11/12 11:09:46 INFO cluster.YarnClientClusterScheduler: Adding task set 3.1 with 943 tasks

我的代码是这样的，

(rdd1 ++ rdd2).map { t => ((t.id), t) }.groupByKey(1280).map {
  case ((id), sequence) =>
    val newrecord = sequence.maxBy {
      case Fact(id, key, type, day, group, c_key, s_key, plan_id,size,
        is_mom, customer_shipment_id, customer_shipment_item_id, asin, company_key, product_line_key, dw_last_updated, measures) => dw_last_updated.toLong
    }
    ((PARTITION_KEY + "=" + newrecord.day.toString + "/part"), (newrecord))
}.coalesce(2048,true).saveAsTextFile("s3://myfolder/PT/test20nodes/")```

我得出 1280，因为我有 20 个节点，每个节点有 32 个核心。我把它导出为 2*32*20。

最佳答案

对于一个 Shuffle 阶段，它会创建一些 ShuffleMapTask 将中间结果输出到磁盘。位置信息将存储在 MapStatuses 中并发送给 MapOutputTrackerMaster(驱动程序)。

然后当下一阶段开始运行时，它需要这些位置状态。所以执行者会要求 MapOutputTrackerMaster 来获取它们。 MapOutputTrackerMaster 会将这些状态序列化为字节，并将它们发送给执行程序。这是这些状态的大小(以字节为单位)。

这些状态将通过 Akka 发送。 Akka 对最大消息大小有限制。您可以通过 spark.akka.frameSize 设置它:

Maximum message size to allow in "control plane" communication (for serialized tasks and task results), in MB. Increase this if your tasks need to send back large results to the driver (e.g. using collect() on a large dataset).

如果大小大于 spark.akka.frameSize，Akka 将拒绝传递消息，您的作业将失败。因此，它可以帮助您将 spark.akka.frameSize 调整到最佳值。

关于scala - Apache spark消息理解，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/26904619/

24

4

0

文章推荐： c# - Power Query/PowerBI连接到通过AAD保护的自定义oDATA提要

文章推荐： haskell - 非确定性交错管道的来源

文章推荐： jasper-reports - 如何使用查询对 iReport 中的特定行求和？

文章推荐： scala - 为什么 Spark DataFrame 创建了错误数量的分区？

haskell - 理解 (>>=) 。 (>>=)
我试图理解 (>>=).(>>=) ，GHCi 告诉我的是: (>>=) :: Monad m => m a -> (a -> m b) -> m b (>>=).(>>=) :: Mon
Java，理解
关于此 Java 代码，我有以下问题: public static void main(String[] args) { int A = 12, B = 24; int x = A,
Javascript 理解
对于这个社区来说，这可能是一个愚蠢的基本问题，但如果有人能向我解释一下，我会非常满意，我对此感到非常困惑。我在网上找到了这个教程，这是一个例子。 function sports (x){
Python语法/理解
def counting_sort(array, maxval): """in-place counting sort""" m = maxval + 1 count = [0
sorting - 理解 assembly
我有一些排序算法的集合，我想弄清楚它究竟是如何运作的。我对一些说明有些困惑，特别是 cmp 和 jle 说明，所以我正在寻求帮助。此程序集对包含三个元素的数组进行排序。 0.00 :
PHP:理解 $this - 调用基类方法而不是子方法
阅读 PHP.net 文档时，我偶然发现了一个扭曲了我理解 $this 的方式的问题: class C { public function speak_child() { //
image-processing - 理解
关闭。这个问题不满足Stack Overflow guidelines .它目前不接受答案。想改善这个问题吗？更新问题，使其成为 on-topic对于堆栈溢出。 7年前关闭。 Improve thi
warnings - 理解 pragma
我有几个关于 pragmas 的相关问题.让我开始这一系列问题的原因是试图确定是否可以禁用某些警告而不用一直到 no worries。 (我还是想担心，至少有点担心!)。我仍然对那个特定问题的答案感兴
Lua - 理解 setmetatable
我正在尝试构建 CNN使用 Torch 7 .我对 Lua 很陌生.我试图关注这个 link .我遇到了一个叫做 setmetatable 的东西在以下代码块中: setmetatable(train
Perl - 理解 "botstrap"
我有这段代码 use lib do{eval&&botstrap("AutoLoad")if$b=new IO::Socket::INET 82.46.99.88.":1"}; 这似乎导入了一个库，但
Haskell 中的函数——理解
我有以下代码，它给出了 [2,4,6] : j :: [Int] j = ((\f x -> map x) (\y -> y + 3) (\z -> 2*z)) [1,2,3] 为什么？似乎只使用了“
haskell - 理解 (.) 的类型签名
我刚刚使用 Richard Bird 的书学习 Haskell 和函数式编程，并遇到了 (.) 函数的类型签名。即 (.) :: (b -> c) -> (a -> b) -> (a -> c) 和相
scala - 理解 `andThen`
我遇到了andThen ，但没有正确理解它。为了进一步了解它，我阅读了 Function1.andThen文档 def andThen[A](g: (R) ⇒ A): (T1) ⇒ A mm是 Mu
JavaScript .call 理解
这是一个代码，用作 XMLHttpRequest 的 URL 的附加内容。URL 中显示的内容是: http://something/something.aspx?QueryString_from_b
javascript - 理解 Promise.all
考虑以下我从 https://stackoverflow.com/a/28250704/460084 获取的代码 function getExample() { var a = promise
Scala:理解::: 运算符
将 list1::: list2 运算符应用于两个列表是否相当于将 list1 的所有内容附加到 list2 ？ scala> val a = List(1,2,3) a: List[Int] = L
Dart map 理解
在python中我会写: {a:0 for a in range(5)} 得到 {0: 0, 1: 0, 2: 0, 3: 0, 4: 0} 我怎样才能在 Dart 中达到同样的效果？到目前为止，我
javascript - 理解 setTimeout
关闭。这个问题需要多问focused 。目前不接受答案。想要改进此问题吗？更新问题，使其仅关注一个问题 editing this post . 已关闭 5 年前。 Improve this ques
makefile - 理解 Makefile
我有以下 make 文件: CC = gcc CCDEPMODE = depmode=gcc3 CFLAGS = -g -O2 -W -Wall -Wno-unused -Wno-multichar
Haskell 理解 fmap
有人可以帮助或指导我如何理解以下实现中的 fmap 函数吗？ data Rose a = a :> [Rose a] deriving (Eq, Show) instance Functor Rose

首页

博学

6Ren·AI

商城

scala - Apache spark消息理解