- android - RelativeLayout 背景可绘制重叠内容
- android - 如何链接 cpufeatures lib 以获取 native android 库?
- java - OnItemClickListener 不起作用,但 OnLongItemClickListener 在自定义 ListView 中起作用
- java - Android 文件转字符串
我在 apache spark 中遇到了一个奇怪的问题,我将不胜感激。从 hdfs 读取数据(并进行一些从 json 到对象的转换)后,下一阶段(处理所述对象)在处理完 2 个分区(总共 512 个)后失败。这种情况发生在大型数据集上(我注意到的最小数据集约为 700 兆,但可能会更低,我还没有缩小范围)。
编辑:700 megs 是 tgz 文件大小,未压缩是 6 gigs。
编辑 2:同样的事情发生在 spark 1.1.0
我在一台 32 核、60 演出的机器上使用本地主机运行 spark,设置如下:
spark.akka.timeout = 200
spark.shuffle.consolidateFiles = true
spark.kryoserializer.buffer.mb = 128
spark.reducer.maxMbInFlight = 128
具有 16 gig 执行程序堆大小。内存没有用尽,CPU 负载可以忽略不计。 Spark 永远挂起。
下面是 Spark 日志:
14/09/11 10:19:52 INFO HadoopRDD: Input split: hdfs://localhost:9000/spew/data/json.lines:6351070299+12428842
14/09/11 10:19:53 INFO Executor: Serialized size of result for 511 is 1263
14/09/11 10:19:53 INFO Executor: Sending result for 511 directly to driver
14/09/11 10:19:53 INFO Executor: Finished task ID 511
14/09/11 10:19:53 INFO TaskSetManager: Finished TID 511 in 868 ms on localhost (progress: 512/512)
14/09/11 10:19:53 INFO DAGScheduler: Completed ShuffleMapTask(3, 511)
14/09/11 10:19:53 INFO TaskSchedulerImpl: Removed TaskSet 3.0, whose tasks have all completed, from pool
14/09/11 10:19:53 INFO DAGScheduler: Stage 3 (mapToPair at Main.java:205) finished in 535.874 s
14/09/11 10:19:53 INFO DAGScheduler: looking for newly runnable stages
14/09/11 10:19:53 INFO DAGScheduler: running: Set()
14/09/11 10:19:53 INFO DAGScheduler: waiting: Set(Stage 0, Stage 1, Stage 2)
14/09/11 10:19:53 INFO DAGScheduler: failed: Set()
14/09/11 10:19:53 INFO DAGScheduler: Missing parents for Stage 0: List(Stage 1)
14/09/11 10:19:53 INFO DAGScheduler: Missing parents for Stage 1: List(Stage 2)
14/09/11 10:19:53 INFO DAGScheduler: Missing parents for Stage 2: List()
14/09/11 10:19:53 INFO DAGScheduler: Submitting Stage 2 (FlatMappedRDD[10] at flatMapToPair at Driver.java:145), which is now runnable
14/09/11 10:19:53 INFO DAGScheduler: Submitting 512 missing tasks from Stage 2 (FlatMappedRDD[10] at flatMapToPair at Driver.java:145)
14/09/11 10:19:53 INFO TaskSchedulerImpl: Adding task set 2.0 with 512 tasks
14/09/11 10:19:53 INFO TaskSetManager: Starting task 2.0:0 as TID 512 on executor localhost: localhost (PROCESS_LOCAL)
14/09/11 10:19:53 INFO TaskSetManager: Serialized task 2.0:0 as 3469 bytes in 0 ms
14/09/11 10:19:53 INFO Executor: Running task ID 512
14/09/11 10:19:53 INFO BlockManager: Found block broadcast_0 locally
14/09/11 10:19:53 INFO BlockFetcherIterator$BasicBlockFetcherIterator: maxBytesInFlight: 134217728, targetRequestSize: 26843545
14/09/11 10:19:53 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Getting 512 non-empty blocks out of 512 blocks
14/09/11 10:19:53 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Started 0 remote fetches in 6 ms
14/09/11 10:20:07 INFO Executor: Serialized size of result for 512 is 1479
14/09/11 10:20:07 INFO Executor: Sending result for 512 directly to driver
14/09/11 10:20:07 INFO Executor: Finished task ID 512
14/09/11 10:20:07 INFO TaskSetManager: Starting task 2.0:1 as TID 513 on executor localhost: localhost (PROCESS_LOCAL)
14/09/11 10:20:07 INFO TaskSetManager: Serialized task 2.0:1 as 3469 bytes in 0 ms
14/09/11 10:20:07 INFO Executor: Running task ID 513
14/09/11 10:20:07 INFO TaskSetManager: Finished TID 512 in 13996 ms on localhost (progress: 1/512)
14/09/11 10:20:07 INFO DAGScheduler: Completed ShuffleMapTask(2, 0)
14/09/11 10:20:07 INFO BlockManager: Found block broadcast_0 locally
14/09/11 10:20:07 INFO BlockFetcherIterator$BasicBlockFetcherIterator: maxBytesInFlight: 134217728, targetRequestSize: 26843545
14/09/11 10:20:07 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Getting 512 non-empty blocks out of 512 blocks
14/09/11 10:20:07 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Started 0 remote fetches in 1 ms
14/09/11 10:20:15 INFO Executor: Serialized size of result for 513 is 1479
14/09/11 10:20:15 INFO Executor: Sending result for 513 directly to driver
14/09/11 10:20:15 INFO Executor: Finished task ID 513
14/09/11 10:20:15 INFO TaskSetManager: Starting task 2.0:2 as TID 514 on executor localhost: localhost (PROCESS_LOCAL)
14/09/11 10:20:15 INFO TaskSetManager: Serialized task 2.0:2 as 3469 bytes in 0 ms
14/09/11 10:20:15 INFO Executor: Running task ID 514
14/09/11 10:20:15 INFO TaskSetManager: Finished TID 513 in 7768 ms on localhost (progress: 2/512)
14/09/11 10:20:15 INFO DAGScheduler: Completed ShuffleMapTask(2, 1)
14/09/11 10:20:15 INFO BlockManager: Found block broadcast_0 locally
14/09/11 10:20:15 INFO BlockFetcherIterator$BasicBlockFetcherIterator: maxBytesInFlight: 134217728, targetRequestSize: 26843545
14/09/11 10:20:15 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Getting 512 non-empty blocks out of 512 blocks
14/09/11 10:20:15 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Started 0 remote fetches in 1 ms
1) DAGScheduler: failed: Set()
是什么意思?我认为它并不重要,因为它是 INFO 级别,但你永远不知道。
2) Missing parents
是什么意思?同样,它是信息。
这是 jstack 的输出:
"Service Thread" #20 daemon prio=9 os_prio=0 tid=0x00007f39400ff000 nid=0x10560 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C1 CompilerThread14" #19 daemon prio=9 os_prio=0 tid=0x00007f39400fa000 nid=0x1055f waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C1 CompilerThread13" #18 daemon prio=9 os_prio=0 tid=0x00007f39400f8000 nid=0x1055e waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C1 CompilerThread12" #17 daemon prio=9 os_prio=0 tid=0x00007f39400f6000 nid=0x1055d waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C1 CompilerThread11" #16 daemon prio=9 os_prio=0 tid=0x00007f39400f4000 nid=0x1055c waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C1 CompilerThread10" #15 daemon prio=9 os_prio=0 tid=0x00007f39400f1800 nid=0x1055b waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread9" #14 daemon prio=9 os_prio=0 tid=0x00007f39400ef800 nid=0x1055a waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread8" #13 daemon prio=9 os_prio=0 tid=0x00007f39400ed800 nid=0x10559 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread7" #12 daemon prio=9 os_prio=0 tid=0x00007f39400eb800 nid=0x10558 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread6" #11 daemon prio=9 os_prio=0 tid=0x00007f39400e9800 nid=0x10557 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread5" #10 daemon prio=9 os_prio=0 tid=0x00007f39400e7800 nid=0x10556 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread4" #9 daemon prio=9 os_prio=0 tid=0x00007f39400dd000 nid=0x10555 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread3" #8 daemon prio=9 os_prio=0 tid=0x00007f39400db000 nid=0x10554 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread2" #7 daemon prio=9 os_prio=0 tid=0x00007f39400d8800 nid=0x10553 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread1" #6 daemon prio=9 os_prio=0 tid=0x00007f39400d7000 nid=0x10552 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread0" #5 daemon prio=9 os_prio=0 tid=0x00007f39400d4000 nid=0x10551 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x00007f39400d2000 nid=0x10550 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00007f39400a2800 nid=0x1054f in Object.wait() [0x00007f38d39f8000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:142)
- locked <0x00000000e0038180> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:158)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)
"Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00007f39400a0800 nid=0x1054e in Object.wait() [0x00007f38d3af9000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:502)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:157)
- locked <0x00000000e00161b8> (a java.lang.ref.Reference$Lock)
"main" #1 prio=5 os_prio=0 tid=0x00007f394000a000 nid=0x10535 in Object.wait() [0x00007f3945ee1000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000000e03df000> (a org.apache.spark.scheduler.JobWaiter)
at java.lang.Object.wait(Object.java:502)
at org.apache.spark.scheduler.JobWaiter.awaitResult(JobWaiter.scala:73)
- locked <0x00000000e03df000> (a org.apache.spark.scheduler.JobWaiter)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:452)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1051)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1069)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1083)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1097)
at org.apache.spark.rdd.RDD.foreach(RDD.scala:716)
at org.apache.spark.api.java.JavaRDDLike$class.foreach(JavaRDDLike.scala:294)
at org.apache.spark.api.java.JavaPairRDD.foreach(JavaPairRDD.scala:44)
at spew.Driver.run(Driver.java:88)
at spew.Main.main(Main.java:92)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:303)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:55)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
"VM Thread" os_prio=0 tid=0x00007f3940099800 nid=0x1054d runnable
"GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007f394001f800 nid=0x10536 runnable
"GC task thread#1 (ParallelGC)" os_prio=0 tid=0x00007f3940021000 nid=0x10537 runnable
"GC task thread#2 (ParallelGC)" os_prio=0 tid=0x00007f3940023000 nid=0x10538 runnable
"GC task thread#3 (ParallelGC)" os_prio=0 tid=0x00007f3940024800 nid=0x10539 runnable
"GC task thread#4 (ParallelGC)" os_prio=0 tid=0x00007f3940026800 nid=0x1053a runnable
"GC task thread#5 (ParallelGC)" os_prio=0 tid=0x00007f3940028000 nid=0x1053b runnable
"GC task thread#6 (ParallelGC)" os_prio=0 tid=0x00007f394002a000 nid=0x1053c runnable
"GC task thread#7 (ParallelGC)" os_prio=0 tid=0x00007f394002b800 nid=0x1053d runnable
"GC task thread#8 (ParallelGC)" os_prio=0 tid=0x00007f394002d000 nid=0x1053e runnable
"GC task thread#9 (ParallelGC)" os_prio=0 tid=0x00007f394002f000 nid=0x1053f runnable
"GC task thread#10 (ParallelGC)" os_prio=0 tid=0x00007f3940030800 nid=0x10540 runnable
"GC task thread#11 (ParallelGC)" os_prio=0 tid=0x00007f3940032800 nid=0x10541 runnable
"GC task thread#12 (ParallelGC)" os_prio=0 tid=0x00007f3940034000 nid=0x10542 runnable
"GC task thread#13 (ParallelGC)" os_prio=0 tid=0x00007f3940036000 nid=0x10543 runnable
"GC task thread#14 (ParallelGC)" os_prio=0 tid=0x00007f3940037800 nid=0x10544 runnable
"GC task thread#15 (ParallelGC)" os_prio=0 tid=0x00007f3940039800 nid=0x10545 runnable
"GC task thread#16 (ParallelGC)" os_prio=0 tid=0x00007f394003b000 nid=0x10546 runnable
"GC task thread#17 (ParallelGC)" os_prio=0 tid=0x00007f394003d000 nid=0x10547 runnable
"GC task thread#18 (ParallelGC)" os_prio=0 tid=0x00007f394003e800 nid=0x10548 runnable
"GC task thread#19 (ParallelGC)" os_prio=0 tid=0x00007f3940040800 nid=0x10549 runnable
"GC task thread#20 (ParallelGC)" os_prio=0 tid=0x00007f3940042000 nid=0x1054a runnable
"GC task thread#21 (ParallelGC)" os_prio=0 tid=0x00007f3940044000 nid=0x1054b runnable
"GC task thread#22 (ParallelGC)" os_prio=0 tid=0x00007f3940045800 nid=0x1054c runnable
"VM Periodic Task Thread" os_prio=0 tid=0x00007f3940102000 nid=0x10561 waiting on condition
JNI global references: 422
有没有人遇到过这样的 Spark 问题?这很奇怪,因为它适用于小型(微型)数据集(测试装置等)。
最佳答案
我不回答这么老的问题spark版本,但我知道当应用程序挂起时,可能是由于您的资源被终止(例如 yarn)。
我在 Is Spark's KMeans unable to handle bigdata? 中遇到了类似的问题你能做的最好的事情就是微调你的应用程序,因为你的问题中没有任何信息可以建议如何解决这个问题。
您还可以根据经验微调分区的数量。
对于另一项我必须微调以扩展到 15T 数据的工作,我在 memoryOverhead issue in Spark 中报告了我的方法。 ,但我不知道这是否相关。
正如 Karl Higley 所建议的,我同意:
有向无环图 (DAG) 调度失败意味着失败阶段集为空(即还没有任何失败。)。
Missing parents 是阶段的列表,其结果需要计算请求的结果并且尚未缓存在内存中。
关于hadoop - Spark 1.0.2(也是 1.1.0)卡在一个分区上,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/25787639/
是一种在 Neo4j 分区之间进行物理分离的方法吗? 这意味着以下查询将转到 node1: Match (a:User:Facebook) 虽然此查询将转到另一个节点(可能托管在 docker 上)
我尝试在我的 SQL 服务器上使用分区函数对我的一个大表进行分区,但我收到一条错误消息 “只能在SQL Server企业版中创建分区功能。只有SQL Server企业版支持分区。” 所以我想知道没有企
在hadoop文件系统中,我有两个文件,分别是X和Y。通常,hadoop制作的文件X和Y的大小为64 MB。是否可以强制hadoop划分两个文件,以便从X的32 MB和Y的32 MB中创建一个64 M
据我了解,如果我们有一个主键,则使用该键对数据进行分区并将其存储在节点中(例如使用随机分区器)。 现在我不确定的是,如果我有多个键(又名复合键),是用于分区数据的键的组合还是它将是第一个主键? 例如,
我正在向我的 SSAS 多维数据集添加分区,我想知道是否有多个分区可以保留在下面?多少太多了,最佳实践限制是 20 还是 200?有没有人可以分享任何真实世界的知识? 最佳答案 这是 another
我有一个包含大约 200 万条记录的大表,我想对其进行分区。 我将 id 列设置为 PRIMARY AUTO_INCRMENT int (并且它必须始终是唯一的)。我有一列“theyear”int(4
我正在做 mysql 列表分区。我的表数据如下 ---------------------------------------- id | unique_token | city | student_
我有一个表,我们每天在其中插入大约 2000 万个条目(没有任何限制的盲插入)。我们有两个外键,其中一个是对包含大约 1000 万个条目的表的引用 ID。 我打算删除此表中超过一个月的所有数据,因为不
我想在一款足球奇幻游戏中尝试使用 MySQL Partitioning,该游戏的用户分布在联赛中,每个联赛都有一个用户可以买卖球员的市场。当很多用户同时玩时,我在这张表中遇到了一些僵局(在撰写本文时大
我是 jQuery 的新手,想知道是否可以获取一些变量并将它们的除法作为 CSS 宽度。到目前为止我在这里: var x = $(".some-container").length; var y =
所以我正在做家庭作业,我需要为分区、斯特林数(第一类和第二类)和第一类的切比雪夫多项式创建递归函数。我的程序应该能够让用户输入一个正整数 n,然后创建名为 Partitions.txt、Stirlin
我在数据框中有一列,其中包含大约 1,4M 行聊天对话,其中每个单元格中的一般格式为 (1): “名称代理 : 对话” 但是,并非列中的所有单元格都采用这种格式。有些单元格只是 (2): “对话” 我
我在尝试隐藏 a 时遇到了一些问题,直到用户单击某个元素为止。 HTML 看起来像: BRAND item 1 item 2 item 3
一.为什么kafka要做分区? 因为当一台机器有可能扛不住(类比:就像redis集群中的redis-cluster一样,一个master抗不住写,那么就多个master去抗写)
我有一些销售数据,我需要发送存储在单独表中的可用槽中的数量。 销售数据示例: id数量112131415369 create table sales (id serial primary key, q
我计划设置多个节点以使用 glusterfs 创建分布式复制卷 我使用主(也是唯一)分区上的目录在两个节点上创建了一个 gluster 复制卷。 gluster volume create vol_d
我正在尝试使用 sum() over (partition by) 但在总和中过滤。我的用例是将每个产品的 12 个月累计到一个月的条目,因此: ITEM MONTH SALES Item
是否可以创建多个 Enumerators出单Enumerator ? 我正在寻找的相当于 List.partition返回 (List[A], List[A]) ,比如 List().partitio
我正在创建一个基于 x86 的非常简单的 Yocto 图像。 我希望/文件系统是只读的,所以我设置了 IMAGE_FEATURES_append = " read-only-rootfs " 在原件的
是否可以使用一次 collect 调用来创建 2 个新列表?如果没有,我该如何使用分区来做到这一点? 最佳答案 collect(在TraversableLike上定义并在所有子类中可用)与集合和Par
我是一名优秀的程序员,十分优秀!