
hadoop - Hadoop Hive stays stuck when I run a create request with multiple objects

Reposted. Author: 行者123. Last updated: 2023-12-02 21:25:38

Hive works when I run simple table creations, but when I try to run any CREATE TABLE with many objects, it freezes after printing the following:

Query ID = root_20160321031616_6fbfd536-f3e5-4517-ab8b-2dc8ddb34b85

Total jobs = 3

Launching Job 1 out of 3

Number of reduce tasks is set to 0 since there's no reduce operator

Starting Job = job_1458530057671_0001, Tracking URL = http://sandbox.hortonworks.com:8088/proxy/application_1458530057671_0001/

Kill Command = /usr/hdp/2.2.0.0-2041/hadoop/bin/hadoop job -kill job_1458530057671_0001

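I don't remember whether the "... no reduce operator" message appeared back when things were working.

The code I am trying to run is relatively simple: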
create table BMO_F069_table as
select
get_json_object(BMO_F069.json, '$.text') as text,
get_json_object(BMO_F069.json, '$.in_reply_to_user_id') as in_reply_to_user_id,
get_json_object(BMO_F069.json, '$.id') as id,
get_json_object(BMO_F069.json, '$.favorite_count') as favorite_count,
get_json_object(BMO_F069.json, '$.coordinates') as coordinates,
get_json_object(BMO_F069.json, '$.id_str') as id_str,
get_json_object(BMO_F069.json, '$.user.location') as location,
get_json_object(BMO_F069.json, '$.lang') as lang,
get_json_object(BMO_F069.json, '$.indices') as indices,
get_json_object(BMO_F069.json, '$.type') as type,
get_json_object(BMO_F069.json, '$.hashtags') as hashtags,
get_json_object(BMO_F069.json, '$.user_mentions') as user_mentions,
get_json_object(BMO_F069.json, '$.user.screen_name') as screen_name,
get_json_object(BMO_F069.json, '$.user.name') as name,
get_json_object(BMO_F069.json, '$.in_reply_to_screen_name') as in_reply_to_screen_name,
get_json_object(BMO_F069.json, '$.retweet_count') as retweet_count,
get_json_object(BMO_F069.json, '$.favorited') as favorited,
get_json_object(BMO_F069.json, '$.retweeted_status') as retweeted_status,
get_json_object(BMO_F069.json, '$.user') as user,
get_json_object(BMO_F069.json, '$.followers_count') as followers_count,
get_json_object(BMO_F069.json, '$.statuses_count') as statuses_count,
get_json_object(BMO_F069.json, '$.description') as description,
get_json_object(BMO_F069.json, '$.geo_enabled') as geo_enabled,
get_json_object(BMO_F069.json, '$.favourites_count') as favourites_count,
get_json_object(BMO_F069.json, '$.created_at') as created_at,
get_json_object(BMO_F069.json, '$.time_zone') as time_zone,
get_json_object(BMO_F069.json, '$.listed_count') as listed_count,
get_json_object(BMO_F069.json, '$.in_reply_to_user_id_str') as in_reply_to_user_id_str
from BMO_F069;

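The data is about 60 MB. Unfortunately, I don't know enough about the cluster to give you its specs, sorry. I would still really appreciate your feedback. Thanks.
Over the past few weeks I have run hundreds of similar queries, on data of 0.5 TB, without any problem. When the job freezes, Hive stops processing any new submission. Is there any way to reset it?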

When I start Hive from the terminal, I get the following opening lines. Is this normal? I don't remember what the messages were before.
16/03/21 21:16:55 WARN conf.HiveConf: HiveConf of name hive.optimize.mapjoin.mapreduce does not exist
16/03/21 21:16:55 WARN conf.HiveConf: HiveConf of name hive.heapsize does not exist
16/03/21 21:16:55 WARN conf.HiveConf: HiveConf of name hive.server2.enable.impersonation does not exist
16/03/21 21:16:55 WARN conf.HiveConf: HiveConf of name hive.auto.convert.sortmerge.join.noconditionaltask does not exist

Any help is greatly appreciated.

Best answer

When you launch an unoptimized, heavy job, Hive will keep trying to complete it, no matter how long it takes.

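Since you have not provided any useful information about the cluster specs, the data volume, or the query, my guess is that either your query is badly written or your cluster lacks the resources to complete the request in a reasonable time.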
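
On the "is there any way to reset it?" part of the question, one option is to inspect and kill the stuck application through YARN. The following is a minimal sketch, not a confirmed fix: it assumes the `yarn` CLI is on the PATH of the sandbox, and the `job_to_app_id` and `run` helpers and the `DRY_RUN` variable are hypothetical names introduced only for illustration. The job ID is the one from the "Starting Job" line above; the application ID is the same ID with the `job_` prefix swapped for `application_`, as seen in the Tracking URL. By default the script only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Convert a MapReduce job ID (job_<cluster>_<seq>) into the matching
# YARN application ID (application_<cluster>_<seq>).
job_to_app_id() {
    case "$1" in
        job_*) printf 'application_%s\n' "${1#job_}" ;;
        *)     printf 'not a job ID: %s\n' "$1" >&2; return 1 ;;
    esac
}

# Job ID copied from the "Starting Job" line in the question's output.
JOB_ID="job_1458530057671_0001"
APP_ID="$(job_to_app_id "$JOB_ID")"

# Hypothetical helper: with DRY_RUN=1 (the default here) it only prints
# the command; set DRY_RUN=0 to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# 1. List applications that are waiting for resources or running.
run yarn application -list -appStates ACCEPTED,RUNNING
# 2. Show the state of the stuck application.
run yarn application -status "$APP_ID"
# 3. Kill it so that later submissions are no longer queued behind it.
run yarn application -kill "$APP_ID"
```

If the application turns out to be sitting in the ACCEPTED state, that would be consistent with the answer's guess that the cluster lacks the resources to start the job, although it does not prove it.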

Regarding "hadoop - Hadoop Hive stays stuck when I run a create request with multiple objects", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/36122714/
