
apache-spark - Hive on Spark query hangs due to insufficient resources


I am trying to set up Hive on Spark on a single small virtual machine (4 GB RAM), but I cannot get it to process queries.

For example, SELECT max(price) FROM rentflattoday never completes. A minimal reproduction is sketched below, followed by the container log captured while the query hung in its endless loop:
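(A sketch of the reproduction; the csu database name is an assumption taken from the warehouse path that appears in the log, and the SET line only mirrors what hive-site.xml already configures.)

-- run from the Hive CLI or Beeline
SET hive.execution.engine=spark;       -- already set globally in hive-site.xml below
USE csu;                               -- database inferred from /user/hive/warehouse/csu.db in the log
SELECT max(price) FROM rentflattoday;  -- this is the statement that hangs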

 2019-02-24 14:41:35 INFO  SignalUtils:54 - Registered signal handler for TERM
2019-02-24 14:41:35 INFO SignalUtils:54 - Registered signal handler for HUP
2019-02-24 14:41:35 INFO SignalUtils:54 - Registered signal handler for INT
2019-02-24 14:41:35 INFO SecurityManager:54 - Changing view acls to: hadoop
2019-02-24 14:41:35 INFO SecurityManager:54 - Changing modify acls to: hadoop
2019-02-24 14:41:35 INFO SecurityManager:54 - Changing view acls groups to:
2019-02-24 14:41:35 INFO SecurityManager:54 - Changing modify acls groups to:
2019-02-24 14:41:35 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop); groups with view permissions: Set(); users with modify permissions: Set(hadoop); groups with modify permissions: Set()
2019-02-24 14:41:36 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2019-02-24 14:41:37 INFO ApplicationMaster:54 - Preparing Local resources
2019-02-24 14:41:39 INFO ApplicationMaster:54 - ApplicationAttemptId: appattempt_1551033757513_0011_000001
2019-02-24 14:41:39 INFO ApplicationMaster:54 - Starting the user application in a separate Thread
2019-02-24 14:41:39 INFO ApplicationMaster:54 - Waiting for spark context initialization...
2019-02-24 14:41:39 INFO RemoteDriver:125 - Connecting to: weirv1:42832
2019-02-24 14:41:39 INFO HiveConf:187 - Found configuration file file:/tmp/hadoop-hadoop/nm-local-dir/usercache/hadoop/filecache/28/__spark_conf__.zip/__hadoop_conf__/hive-site.xml
2019-02-24 14:41:40 WARN HiveConf:5214 - HiveConf of name hive.enforce.bucketing does not exist
2019-02-24 14:41:40 WARN Rpc:170 - Invalid log level null, reverting to default.
2019-02-24 14:41:41 INFO SparkContext:54 - Running Spark version 2.4.0
2019-02-24 14:41:41 INFO SparkContext:54 - Submitted application: Hive on Spark (sessionId = 94aded5e-fbeb-4839-af11-9c5f5902fa0c)
2019-02-24 14:41:41 INFO SecurityManager:54 - Changing view acls to: hadoop
2019-02-24 14:41:41 INFO SecurityManager:54 - Changing modify acls to: hadoop
2019-02-24 14:41:41 INFO SecurityManager:54 - Changing view acls groups to:
2019-02-24 14:41:41 INFO SecurityManager:54 - Changing modify acls groups to:
2019-02-24 14:41:41 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop); groups with view permissions: Set(); users with modify permissions: Set(hadoop); groups with modify permissions: Set()
2019-02-24 14:41:41 INFO Utils:54 - Successfully started service 'sparkDriver' on port 37368.
2019-02-24 14:41:41 INFO SparkEnv:54 - Registering MapOutputTracker
2019-02-24 14:41:41 INFO SparkEnv:54 - Registering BlockManagerMaster
2019-02-24 14:41:41 INFO BlockManagerMasterEndpoint:54 - Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
2019-02-24 14:41:41 INFO BlockManagerMasterEndpoint:54 - BlockManagerMasterEndpoint up
2019-02-24 14:41:41 INFO DiskBlockManager:54 - Created local directory at /tmp/hadoop-hadoop/nm-local-dir/usercache/hadoop/appcache/application_1551033757513_0011/blockmgr-ea75eeb2-fb84-4d22-8f29-ba4283eb5efc
2019-02-24 14:41:42 INFO MemoryStore:54 - MemoryStore started with capacity 366.3 MB
2019-02-24 14:41:42 INFO SparkEnv:54 - Registering OutputCommitCoordinator
2019-02-24 14:41:42 INFO log:192 - Logging initialized @9697ms
2019-02-24 14:41:43 INFO JettyUtils:54 - Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /jobs, /jobs/json, /jobs/job, /jobs/job/json, /stages, /stages/json, /stages/stage, /stages/stage/json, /stages/pool, /stages/pool/json, /storage, /storage/json, /storage/rdd, /storage/rdd/json, /environment, /environment/json, /executors, /executors/json, /executors/threadDump, /executors/threadDump/json, /static, /, /api, /jobs/job/kill, /stages/stage/kill.
2019-02-24 14:41:43 INFO Server:351 - jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
2019-02-24 14:41:43 INFO Server:419 - Started @10064ms
2019-02-24 14:41:43 INFO AbstractConnector:278 - Started ServerConnector@5d1faff9{HTTP/1.1,[http/1.1]}{0.0.0.0:33181}
2019-02-24 14:41:43 INFO Utils:54 - Successfully started service 'SparkUI' on port 33181.
2019-02-24 14:41:43 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3e4dde9a{/jobs,null,AVAILABLE,@Spark}
2019-02-24 14:41:43 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5b4b2d8b{/jobs/json,null,AVAILABLE,@Spark}
2019-02-24 14:41:43 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@36f37180{/jobs/job,null,AVAILABLE,@Spark}
2019-02-24 14:41:43 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@edf8590{/jobs/job/json,null,AVAILABLE,@Spark}
2019-02-24 14:41:43 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@c7ad6b5{/stages,null,AVAILABLE,@Spark}
2019-02-24 14:41:43 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@2128c9cb{/stages/json,null,AVAILABLE,@Spark}
2019-02-24 14:41:43 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@4ceefc2f{/stages/stage,null,AVAILABLE,@Spark}
2019-02-24 14:41:43 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3fb4ee4{/stages/stage/json,null,AVAILABLE,@Spark}
2019-02-24 14:41:43 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@38cfc530{/stages/pool,null,AVAILABLE,@Spark}
2019-02-24 14:41:43 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@7eff0f35{/stages/pool/json,null,AVAILABLE,@Spark}
2019-02-24 14:41:43 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@4f9d6ef6{/storage,null,AVAILABLE,@Spark}
2019-02-24 14:41:43 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@16c8958f{/storage/json,null,AVAILABLE,@Spark}
2019-02-24 14:41:43 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@50683423{/storage/rdd,null,AVAILABLE,@Spark}
2019-02-24 14:41:43 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@56e81fbc{/storage/rdd/json,null,AVAILABLE,@Spark}
2019-02-24 14:41:43 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@72262149{/environment,null,AVAILABLE,@Spark}
2019-02-24 14:41:43 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@2010a66f{/environment/json,null,AVAILABLE,@Spark}
2019-02-24 14:41:43 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@31c84762{/executors,null,AVAILABLE,@Spark}
2019-02-24 14:41:43 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@27cbab18{/executors/json,null,AVAILABLE,@Spark}
2019-02-24 14:41:43 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@64a4eac1{/executors/threadDump,null,AVAILABLE,@Spark}
2019-02-24 14:41:43 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@41221be4{/executors/threadDump/json,null,AVAILABLE,@Spark}
2019-02-24 14:41:43 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@32a2a7f5{/static,null,AVAILABLE,@Spark}
2019-02-24 14:41:43 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@32d23207{/,null,AVAILABLE,@Spark}
2019-02-24 14:41:43 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3808225f{/api,null,AVAILABLE,@Spark}
2019-02-24 14:41:43 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@35b9f8ea{/jobs/job/kill,null,AVAILABLE,@Spark}
2019-02-24 14:41:43 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@c552738{/stages/stage/kill,null,AVAILABLE,@Spark}
2019-02-24 14:41:43 INFO SparkUI:54 - Bound SparkUI to 0.0.0.0, and started at http://weirV1:33181
2019-02-24 14:41:43 INFO YarnClusterScheduler:54 - Created YarnClusterScheduler
2019-02-24 14:41:43 INFO SchedulerExtensionServices:54 - Starting Yarn extension services with app application_1551033757513_0011 and attemptId Some(appattempt_1551033757513_0011_000001)
2019-02-24 14:41:43 INFO Utils:54 - Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 35541.
2019-02-24 14:41:43 INFO NettyBlockTransferService:54 - Server created on weirV1:35541
2019-02-24 14:41:43 INFO BlockManager:54 - Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
2019-02-24 14:41:43 INFO BlockManagerMaster:54 - Registering BlockManager BlockManagerId(driver, weirV1, 35541, None)
2019-02-24 14:41:43 INFO BlockManagerMasterEndpoint:54 - Registering block manager weirV1:35541 with 366.3 MB RAM, BlockManagerId(driver, weirV1, 35541, None)
2019-02-24 14:41:43 INFO BlockManagerMaster:54 - Registered BlockManager BlockManagerId(driver, weirV1, 35541, None)
2019-02-24 14:41:43 INFO BlockManager:54 - Initialized BlockManager: BlockManagerId(driver, weirV1, 35541, None)
2019-02-24 14:41:44 INFO JettyUtils:54 - Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /metrics/json.
2019-02-24 14:41:44 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5e35b086{/metrics/json,null,AVAILABLE,@Spark}
2019-02-24 14:41:44 INFO EventLoggingListener:54 - Logging events to hdfs:/spark-event-log/application_1551033757513_0011_1
2019-02-24 14:41:45 INFO RMProxy:98 - Connecting to ResourceManager at weirv1/80.211.222.23:8030
2019-02-24 14:41:45 INFO YarnRMClient:54 - Registering the ApplicationMaster
2019-02-24 14:41:45 INFO ApplicationMaster:54 -
===============================================================================
YARN executor launch context:
env:
CLASSPATH -> {{PWD}}<CPS>{{PWD}}/__spark_conf__<CPS>{{PWD}}/__spark_libs__/*<CPS>$HADOOP_CONF_DIR<CPS>$HADOOP_COMMON_HOME/share/hadoop/common/*<CPS>$HADOOP_COMMON_HOME/share/hadoop/common/lib/*<CPS>$HADOOP_HDFS_HOME/share/hadoop/hdfs/*<CPS>$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*<CPS>$HADOOP_YARN_HOME/share/hadoop/yarn/*<CPS>$HADOOP_YARN_HOME/share/hadoop/yarn/lib/*<CPS>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*<CPS>{{PWD}}/__spark_conf__/__hadoop_conf__
SPARK_YARN_STAGING_DIR -> hdfs://localhost:9000/user/hadoop/.sparkStaging/application_1551033757513_0011
SPARK_USER -> hadoop

command:
{{JAVA_HOME}}/bin/java \
-server \
-Xmx1024m \
'-Dhive.spark.log.dir=/home/hadoop/spark/logs/' \
-Djava.io.tmpdir={{PWD}}/tmp \
'-Dspark.hadoop.hbase.regionserver.info.port=16030' \
'-Dspark.hadoop.hbase.master.info.port=16010' \
'-Dspark.ui.port=0' \
'-Dspark.hadoop.hbase.rest.port=8080' \
'-Dspark.hadoop.hbase.master.port=16000' \
'-Dspark.hadoop.hbase.regionserver.port=16020' \
'-Dspark.driver.port=37368' \
'-Dspark.hadoop.hbase.status.multicast.address.port=16100' \
-Dspark.yarn.app.container.log.dir=<LOG_DIR> \
-XX:OnOutOfMemoryError='kill %p' \
org.apache.spark.executor.CoarseGrainedExecutorBackend \
--driver-url \
spark://CoarseGrainedScheduler@weirV1:37368 \
--executor-id \
<executorId> \
--hostname \
<hostname> \
--cores \
4 \
--app-id \
application_1551033757513_0011 \
--user-class-path \
file:$PWD/__app__.jar \
1><LOG_DIR>/stdout \
2><LOG_DIR>/stderr

resources:
__app__.jar -> resource { scheme: "hdfs" host: "localhost" port: 9000 file: "/user/hadoop/.sparkStaging/application_1551033757513_0011/hive-exec-3.1.1.jar" } size: 40604738 timestamp: 1551037287119 type: FILE visibility: PRIVATE
__spark_libs__ -> resource { scheme: "hdfs" host: "localhost" port: 9000 file: "/spark-jars-nohive" } size: 0 timestamp: 1550932521588 type: ARCHIVE visibility: PUBLIC
__spark_conf__ -> resource { scheme: "hdfs" host: "localhost" port: 9000 file: "/user/hadoop/.sparkStaging/application_1551033757513_0011/__spark_conf__.zip" } size: 623550 timestamp: 1551037288226 type: ARCHIVE visibility: PRIVATE

===============================================================================
2019-02-24 14:41:46 INFO YarnAllocator:54 - Will request 1 executor container(s), each with 4 core(s) and 1194 MB memory (including 170 MB of overhead)
2019-02-24 14:41:46 INFO YarnSchedulerBackend$YarnSchedulerEndpoint:54 - ApplicationMaster registered as NettyRpcEndpointRef(spark://YarnAM@weirV1:37368)
2019-02-24 14:41:46 INFO YarnAllocator:54 - Submitted 1 unlocalized container requests.
2019-02-24 14:41:46 INFO ApplicationMaster:54 - Started progress reporter thread with (heartbeat : 3000, initial allocation : 200) intervals
2019-02-24 14:42:13 INFO YarnClusterSchedulerBackend:54 - SchedulerBackend is ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime: 30000(ms)
2019-02-24 14:42:13 INFO YarnClusterScheduler:54 - YarnClusterScheduler.postStartHook done
2019-02-24 14:42:13 INFO SparkContext:54 - Added JAR hdfs://localhost:9000/tmp/hive/hadoop/_spark_session_dir/94aded5e-fbeb-4839-af11-9c5f5902fa0c/hive-exec-3.1.1.jar at hdfs://localhost:9000/tmp/hive/hadoop/_spark_session_dir/94aded5e-fbeb-4839-af11-9c5f5902fa0c/hive-exec-3.1.1.jar with timestamp 1551037333719
2019-02-24 14:42:13 INFO RemoteDriver:306 - Received job request befdba6d-70e5-4a3b-a08e-564376ba3b47
2019-02-24 14:42:14 INFO SparkClientUtilities:107 - Copying hdfs://localhost:9000/tmp/hive/hadoop/_spark_session_dir/94aded5e-fbeb-4839-af11-9c5f5902fa0c/hive-exec-3.1.1.jar to /tmp/hadoop-hadoop/nm-local-dir/usercache/hadoop/appcache/application_1551033757513_0011/container_1551033757513_0011_01_000001/tmp/1551037299410-0/hive-exec-3.1.1.jar
2019-02-24 14:42:14 INFO SparkClientUtilities:71 - Added jar[file:/tmp/hadoop-hadoop/nm-local-dir/usercache/hadoop/appcache/application_1551033757513_0011/container_1551033757513_0011_01_000001/tmp/1551037299410-0/hive-exec-3.1.1.jar] to classpath.
2019-02-24 14:42:16 INFO deprecation:1173 - mapred.task.is.map is deprecated. Instead, use mapreduce.task.ismap
2019-02-24 14:42:16 INFO Utilities:3298 - Processing alias rentflattoday
2019-02-24 14:42:16 INFO Utilities:3336 - Adding 1 inputs; the first input is hdfs://localhost:9000/user/hive/warehouse/csu.db/rentflattoday
2019-02-24 14:42:16 INFO SerializationUtilities:569 - Serializing MapWork using kryo
2019-02-24 14:42:17 INFO Utilities:633 - Serialized plan (via FILE) - name: Map 1 size: 6.57KB
2019-02-24 14:42:18 INFO MemoryStore:54 - Block broadcast_0 stored as values in memory (estimated size 1216.3 KB, free 365.1 MB)
2019-02-24 14:42:19 INFO MemoryStore:54 - Block broadcast_0_piece0 stored as bytes in memory (estimated size 85.2 KB, free 365.0 MB)
2019-02-24 14:42:19 INFO BlockManagerInfo:54 - Added broadcast_0_piece0 in memory on weirV1:35541 (size: 85.2 KB, free: 366.2 MB)
2019-02-24 14:42:19 INFO SparkContext:54 - Created broadcast 0 from Map 1
2019-02-24 14:42:19 INFO Utilities:429 - PLAN PATH = hdfs://localhost:9000/tmp/hive/hadoop/75557489-581b-4292-b43b-1c86c6bcdcb2/hive_2019-02-24_14-41-17_480_8986995693652128044-2/-mr-10004/8b6206d1-557f-4345-ace3-9dfe64d6634b/map.xml
2019-02-24 14:42:19 INFO CombineHiveInputFormat:477 - Total number of paths: 1, launching 1 threads to check non-combinable ones.
2019-02-24 14:42:19 INFO CombineHiveInputFormat:413 - CombineHiveInputSplit creating pool for hdfs://localhost:9000/user/hive/warehouse/csu.db/rentflattoday; using filter path hdfs://localhost:9000/user/hive/warehouse/csu.db/rentflattoday
2019-02-24 14:42:20 INFO FileInputFormat:283 - Total input paths to process : 1
2019-02-24 14:42:20 INFO CombineFileInputFormat:413 - DEBUG: Terminated node allocation with : CompletedNodes: 1, size left: 0
2019-02-24 14:42:20 INFO CombineHiveInputFormat:467 - number of splits 1
2019-02-24 14:42:20 INFO CombineHiveInputFormat:587 - Number of all splits 1
2019-02-24 14:42:20 INFO SerializationUtilities:569 - Serializing ReduceWork using kryo
2019-02-24 14:42:20 INFO Utilities:633 - Serialized plan (via FILE) - name: Reducer 2 size: 3.84KB
2019-02-24 14:42:20 INFO SparkPlan:107 -

Spark RDD Graph:

(1) Reducer 2 (1) MapPartitionsRDD[4] at Reducer 2 []
| Reducer 2 (GROUP, 1) MapPartitionsRDD[3] at Reducer 2 []
| ShuffledRDD[2] at Reducer 2 []
+-(1) Map 1 (1) MapPartitionsRDD[1] at Map 1 []
| Map 1 (rentflattoday, 1) HadoopRDD[0] at Map 1 []

2019-02-24 14:42:20 INFO DAGScheduler:54 - Registering RDD 1 (Map 1)
2019-02-24 14:42:20 INFO DAGScheduler:54 - Got job 0 (Reducer 2) with 1 output partitions
2019-02-24 14:42:20 INFO DAGScheduler:54 - Final stage: ResultStage 1 (Reducer 2)
2019-02-24 14:42:20 INFO DAGScheduler:54 - Parents of final stage: List(ShuffleMapStage 0)
2019-02-24 14:42:20 INFO DAGScheduler:54 - Missing parents: List(ShuffleMapStage 0)
2019-02-24 14:42:20 INFO DAGScheduler:54 - Submitting ShuffleMapStage 0 (Map 1 (1) MapPartitionsRDD[1] at Map 1), which has no missing parents
2019-02-24 14:42:21 INFO MemoryStore:54 - Block broadcast_1 stored as values in memory (estimated size 293.7 KB, free 364.7 MB)
2019-02-24 14:42:21 INFO MemoryStore:54 - Block broadcast_1_piece0 stored as bytes in memory (estimated size 88.1 KB, free 364.7 MB)
2019-02-24 14:42:21 INFO BlockManagerInfo:54 - Added broadcast_1_piece0 in memory on weirV1:35541 (size: 88.1 KB, free: 366.1 MB)
2019-02-24 14:42:21 INFO SparkContext:54 - Created broadcast 1 from broadcast at DAGScheduler.scala:1161
2019-02-24 14:42:21 INFO DAGScheduler:54 - Submitting 1 missing tasks from ShuffleMapStage 0 (Map 1 (1) MapPartitionsRDD[1] at Map 1) (first 15 tasks are for partitions Vector(0))
2019-02-24 14:42:21 INFO YarnClusterScheduler:54 - Adding task set 0.0 with 1 tasks
2019-02-24 14:42:36 WARN YarnClusterScheduler:66 - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
2019-02-24 14:42:51 WARN YarnClusterScheduler:66 - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
2019-02-24 14:43:06 WARN YarnClusterScheduler:66 - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
2019-02-24 14:43:21 WARN YarnClusterScheduler:66 - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
2019-02-24 14:43:36 WARN YarnClusterScheduler:66 - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
2019-02-24 14:43:51 WARN YarnClusterScheduler:66 - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
2019-02-24 14:44:06 WARN YarnClusterScheduler:66 - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
2019-02-24 14:44:21 WARN YarnClusterScheduler:66 - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
2019-02-24 14:44:36 WARN YarnClusterScheduler:66 - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
2019-02-24 14:44:51 WARN YarnClusterScheduler:66 - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
2019-02-24 14:45:06 WARN YarnClusterScheduler:66 - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
2019-02-24 14:45:21 WARN YarnClusterScheduler:66 - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
2019-02-24 14:45:36 WARN YarnClusterScheduler:66 - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources

Here are my hive-site.xml and yarn-site.xml. First, hive-site.xml:
<configuration>

  ...

  <property>
    <name>hive.execution.engine</name>
    <value>spark</value>
  </property>

  <property>
    <name>spark.master</name>
    <value>yarn</value>
  </property>

  <property>
    <name>spark.submit.deployMode</name>
    <value>cluster</value>
  </property>

  <property>
    <name>spark.home</name>
    <value>/home/hadoop/spark</value>
  </property>

  <property>
    <name>spark.yarn.archive</name>
    <value>hdfs:///spark-jars-nohive</value>
  </property>

  <property>
    <name>spark.queue.name</name>
    <value>default</value>
  </property>

  <property>
    <name>spark.eventLog.enabled</name>
    <value>true</value>
  </property>

  <property>
    <name>spark.eventLog.dir</name>
    <value>hdfs:///spark-event-log</value>
  </property>

  <property>
    <name>spark.serializer</name>
    <value>org.apache.spark.serializer.KryoSerializer</value>
  </property>

  <property>
    <name>spark.executor.cores</name>
    <value>4</value>
  </property>

  <property>
    <name>spark.executor.instances</name>
    <value>1</value>
  </property>

  <property>
    <name>spark.dynamicAllocation.enabled</name>
    <value>false</value>
  </property>

  <property>
    <name>spark.executor.memory</name>
    <value>1024m</value>
  </property>

  <property>
    <name>spark.executor.memoryOverhead</name>
    <value>170m</value>
  </property>

</configuration>



And yarn-site.xml:

<configuration>

  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>

  <property>
    <name>yarn.acl.enable</name>
    <value>0</value>
  </property>

  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>weirv1</value>
  </property>

  <property>
    <name>yarn.nodemanager.env-whitelist</name>
    <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
  </property>

  <property>
    <description>Amount of physical memory, in MB, that can be allocated for containers.</description>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>3072</value>
  </property>

  <property>
    <description>The minimum allocation size for every container request at the RM, in MBs. Memory requests lower than this won't take effect, and the specified value will get allocated at minimum.</description>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>1024</value>
  </property>

  <property>
    <description>The maximum allocation size for every container request at the RM, in MBs. Memory requests higher than this won't take effect, and will get capped to this value.</description>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>3072</value>
  </property>

  <property>
    <name>yarn.app.mapreduce.am.resource.mb</name>
    <value>2048</value>
  </property>

  <property>
    <name>yarn.app.mapreduce.am.command-opts</name>
    <value>-Xmx1638m</value>
  </property>

  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
    <description>Whether virtual memory limits will be enforced for containers.</description>
  </property>

  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
  </property>

  <property>
    <name>yarn.scheduler.fair.user-as-default-queue</name>
    <value>false</value>
  </property>

  <property>
    <name>yarn.scheduler.fair.allocation.file</name>
    <value>/home/hadoop/hadoop/etc/hadoop/fair-scheduler.xml</value>
  </property>

</configuration>

Since I am new to all of this, I assume some of these settings are wrong or incomplete. Or does the warning in the log simply mean that my machine is running out of memory and I should change the memory settings?

Thanks :-)

Best Answer

Since I have figured it out, I am posting it here in case anyone else stumbles upon this. It turned out the machine really was short of memory: setting yarn.scheduler.minimum-allocation-mb to 512 and spark.executor.memory to 512m helped.
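A plausible reading of the log, stated as an interpretation rather than fact: the executor request of 1194 MB (1024 m heap plus 170 m overhead) is rounded up to 2048 MB by the 1024 MB minimum allocation, and together with the ApplicationMaster container (likely around another 2 GB after rounding) it exceeds the 3072 MB the single NodeManager offers, so YARN never grants the executor and the job waits forever. A minimal sketch of the two changes described in the answer, using the property names already present in the configs above:

In yarn-site.xml (restart YARN afterwards so the scheduler picks it up):

<property>
  <!-- allow containers as small as 512 MB instead of rounding every request up to 1 GB -->
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>512</value>
</property>

In hive-site.xml:

<property>
  <!-- shrink the executor so it fits next to the ApplicationMaster on a 3 GB node -->
  <name>spark.executor.memory</name>
  <value>512m</value>
</property>

With these values the executor request (512 m plus overhead) rounds to roughly 1024 MB, which leaves room for the ApplicationMaster within the 3072 MB set in yarn.nodemanager.resource.memory-mb.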

Regarding apache-spark - Hive on Spark query hangs due to insufficient resources, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/54855958/
