
hadoop - Unable to load data from Hive into ElasticSearch

Reposted. Author: 行者123. Updated: 2023-12-02 21:40:55

I am currently trying to load data from Hive into ElasticSearch. I am using Cloudera CDH 5.3, and I have added the elasticsearch-hadoop Hive 2.0.2 jar to my Hive path. ElasticSearch 1.4.4 is running on 10.44.162.169.

I have a table named hive_cdr with the following columns:

traffic_type_id (bigint)
appelant (int)
called_number (int)
call_duration (int)
location_number (string)
date_heure_appel (string)

I am trying to define an ES-backed table in Hive so that I can load some data into it.
To do that, I ran the following:
CREATE EXTERNAL TABLE es_hive_cdr (
traffic bigint ,
calling int ,
called int ,
duration int ,
location string ,
date string )
ROW FORMAT SERDE 'org.elasticsearch.hadoop.hive.EsSerDe'
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES (
'es.nodes'='10.44.162.169',
'es.resource'='indexCDR/typeCDR'
) ;

However, I get an exception saying that EsStorage is not recognized.

I removed the EsStorage line and ran the statement again to find out what was going on.

I then tried to load the data from the hive_cdr table into the new table:
insert into table es_hive_cdr2
select
traffic_type_id,
appelant,
called_number,
call_duration,
location_number,
date_heure_appel
from hive_cdr;

But it fails, and I get this error:

Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask

STAGE DEPENDENCIES:
  Stage-1 is a root stage
Stage-7 depends on stages: Stage-1 , consists of Stage-4, Stage-3, Stage-5
Stage-4
Stage-0 depends on stages: Stage-4, Stage-3, Stage-6
Stage-2 depends on stages: Stage-0
Stage-3
Stage-5
Stage-6 depends on stages: Stage-5

STAGE PLANS:
  Stage: Stage-1
Map Reduce
Map Operator Tree:
TableScan
alias: hive_cdr
Statistics: Num rows: 267130 Data size: 58768736 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: traffic_type_id (type: bigint), appelant (type: int), called_number (type: int), call_duration (type: int), location_number (type: string), date_heure_appel (type: string)
outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
Statistics: Num rows: 267130 Data size: 58768736 Basic stats: COMPLETE Column stats: NONE
File Output Operator
compressed: false
Statistics: Num rows: 267130 Data size: 58768736 Basic stats: COMPLETE Column stats: NONE
table:
input format: org.apache.hadoop.mapred.TextInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
serde: org.elasticsearch.hadoop.hive.EsSerDe
name: default.es_hive_cdr2

Stage: Stage-7
Conditional Operator

Stage: Stage-4
Move Operator
files:
hdfs directory: true
destination: hdfs://master:8020/user/hive/warehouse/es_hive_cdr2/.hive-staging_hive_2015-03-02_14-09-08_285_4734041865540737822-2/-ext-10000

Stage: Stage-0
Move Operator
tables:
replace: false
table:
input format: org.apache.hadoop.mapred.TextInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
serde: org.elasticsearch.hadoop.hive.EsSerDe
name: default.es_hive_cdr2

Stage: Stage-2
Stats-Aggr Operator

Stage: Stage-3
Map Reduce
Map Operator Tree:
TableScan
File Output Operator
compressed: false
table:
input format: org.apache.hadoop.mapred.TextInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
serde: org.elasticsearch.hadoop.hive.EsSerDe
name: default.es_hive_cdr2

Stage: Stage-5
Map Reduce
Map Operator Tree:
TableScan
File Output Operator
compressed: false
table:
input format: org.apache.hadoop.mapred.TextInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
serde: org.elasticsearch.hadoop.hive.EsSerDe
name: default.es_hive_cdr2

Stage: Stage-6
Move Operator
files:
hdfs directory: true
destination: hdfs://master:8020/user/hive/warehouse/es_hive_cdr2/.hive-staging_hive_2015-03-02_14-09-08_285_4734041865540737822-2/-ext-10000

I really need some help and guidance. Thank you in advance!

Best answer

Try supplying the table properties explicitly.

TBLPROPERTIES('es.resource'='myviews/myview','es.nodes'='host-of-es-cluster','es.port'='9200','es.input.json'='false','es.write.operation'='index','es.index.auto.create'='yes','es.nodes.wan.only'='true');
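
To make the suggestion concrete, here is a minimal sketch of how the table from the question might be declared with a subset of these properties. It rests on several assumptions that are not in the original post: the jar path is a placeholder, the table is declared with STORED BY alone (as the elasticsearch-hadoop documentation does, so that the storage handler supplies the SerDe), and the index name is lowercased to indexcdr/typecdr because Elasticsearch rejects index names that contain uppercase letters. The plan in the question writes through HiveIgnoreKeyTextOutputFormat into the HDFS warehouse directory, which suggests that without the storage handler the rows never reach Elasticsearch.

```sql
-- Register the connector for this session.
-- Placeholder path (assumption); point it at wherever the jar actually lives.
ADD JAR /path/to/elasticsearch-hadoop-hive-2.0.2.jar;

-- ES-backed table: STORED BY only, with no ROW FORMAT SERDE clause.
CREATE EXTERNAL TABLE es_hive_cdr (
  traffic  BIGINT,
  calling  INT,
  called   INT,
  duration INT,
  location STRING,
  date     STRING
)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES (
  'es.nodes'             = '10.44.162.169',
  'es.port'              = '9200',
  'es.resource'          = 'indexcdr/typecdr',  -- lowercased (assumption, see above)
  'es.index.auto.create' = 'yes'
);

-- Copy the rows from the source table; columns are matched by position.
INSERT OVERWRITE TABLE es_hive_cdr
SELECT
  traffic_type_id,
  appelant,
  called_number,
  call_duration,
  location_number,
  date_heure_appel
FROM hive_cdr;
```

If the index should not be created on the fly, setting 'es.index.auto.create' to 'no' and creating the index in Elasticsearch beforehand is the alternative.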

Also change the following property in the elasticsearch.yml file:

network.host: _site_

Regarding "hadoop - Unable to load data from Hive into ElasticSearch", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/28811907/
