
hadoop - Putting a text file into a Hive database


I have been trying to run this code for a long time. Can someone tell me what the problem is?
Code:

CREATE EXTERNAL TABLE samp_log
(
  ip string, col1 string, col2 string, date string, time_hour int, time_min int,
  time_sec int, zone int, request string, request_con string, resp_code int,
  resp_byte bigint, reference string, ext_reference string, col13 string,
  col14 string, col15 string, col16 string, col17 string
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe'
WITH SERDEPROPERTIES ("field.delim"=" ,[,]")
STORED AS TEXTFILE;

Error: Driver returned: 1. Errors: OK FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Cannot validate serde: org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe



I have also added Hive's jar file.
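For context, "Cannot validate serde" typically means the SerDe class could not be loaded in the Hive session. A minimal sketch of registering the hive-contrib jar before running the DDL, where the jar path is an assumption that depends on the installation, would be:

-- Hypothetical path; point this at the hive-contrib jar shipped with your Hive installation.
ADD JAR /usr/lib/hive/lib/hive-contrib.jar;
-- Then re-run the CREATE EXTERNAL TABLE statement in the same session.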

Best Answer

Use RegexSerDe

https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-ApacheWeblogData

Here is a POC:

create external table mytable  
(
ip string
,dt string
,tm string
,tz string
)
row format serde 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
with serdeproperties
(
'input.regex' = '^(.*?) - - \\[(.*?):(.*?) (.*?)\\].*$'
)
location '/tmp/mytable'
;
select * from mytable
;
+-----------------+-------------+------------+------------+
| mytable.ip      | mytable.dt  | mytable.tm | mytable.tz |
+-----------------+-------------+------------+------------+
| 123.123.123.123 | 26/Apr/2000 | 00:23:48   | -0400      |
+-----------------+-------------+------------+------------+
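To reproduce the POC, the location '/tmp/mytable' needs at least one matching log line. A minimal sketch, where the local file name and its contents are assumptions, is:

-- Assumes a local file /tmp/sample_access.log containing a line such as:
--   123.123.123.123 - - [26/Apr/2000:00:23:48 -0400] "GET /index.html HTTP/1.0" 200 6291
load data local inpath '/tmp/sample_access.log' into table mytable;

With LOCAL, LOAD DATA copies the file into the table's location directory ('/tmp/mytable' here), after which the select above should return the row shown.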

Regarding "hadoop - Putting a text file into a Hive database", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/42679126/
