gpt4 book ai didi

json - 如何从json数据创建一个hive表

转载 作者:行者123 更新时间:2023-12-02 18:34:26 25 4
gpt4 key购买 nike

我见过一些表对一个数据表使用一行 json 代码。下面的 json 不是那种形式。相反,整个文件是一个 json 文件,它基本上是一个字典字典。有谁知道如何使这项工作?

json

{
"aa": {
"a": "A",
"b": "B",
"c": "C",
"d": [
{
"d_1": "D-1",
"d_2": "D-2"
}
],
"e": "E"
},
"bb": {
"a": "AA",
"b": "BB",
"c": "CC",
"d": [
{
"d_1": "DD-11",
"d_2": "DD-22"
}
],
"e": "EE"
}
}

hive 表
drop table if exists test_json_letters;
create table test_json_letters
(
my_array ARRAY<struct<
a:string,
b:string,
c:string,
d:array<struct<
d_1:string,
d_2:string
>>
e:string
>>
)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
LOCATION 'path/to/table';

输出应该是
aa,A,B,C,D-1,D-2
bb,AA,BB,CC,DD-11,DD-22

或者
A,B,C,D-1,D-2
AA,BB,CC,DD-11,DD-22

无论哪种方式对我来说都很好。另外,我不是 100% 确定 D 部分的外观。我也可以接受 [{d_1": "D-1","d_2": "D-2"}]而不是我上面的。我对 json 和 hive 表完全陌生,所以我很灵活。

错误是 Display all 560 possibilities? (y or n)在`a:string 之后,所以有一些格式问题,但我不确定它们是什么。

最佳答案

我会尝试如下

你的数据

{"aa": {"a": "A","b": "B","c": "C","d": [{"d_1": "D-1","d_2": "D-2"}],"e": "E"},"bb": {"a": "AA","b": "BB","c": "CC","d": [{"d_1": "DD-11","d_2": "DD-22"}],"e": "EE"}}

创建表语句
CREATE TABLE my_table(aa struct<
a:string,
b:string,
c:string,
d:array<struct<
d_1:string,
d_2:string>>,
e:string>)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
STORED AS TEXTFILE
LOCATION '/user/training/json';

查询表
SELECT * FROM my_table;
+----------------------------------------------------+--+
| my_table.aa |
+----------------------------------------------------+--+
| {"a":"A","b":"B","c":"C","d":[{"d_1":"D-1","d_2":"D-2"}],"e":"E"} |
+----------------------------------------------------+--+
SELECT aa.a FROM my_table;
+----+--+
| a |
+----+--+
| A |
+----+--+
SELECT aa.d FROM my_table;
+------------------------------+--+
| d |
+------------------------------+--+
| [{"d_1":"D-1","d_2":"D-2"}] |
+------------------------------+--+
SELECT aa.d.d_1 FROM my_table;
+----------+--+
| d_1 |
+----------+--+
| ["D-1"] |
+----------+--+

另一个具有更复杂结构的好例子是

数据:file2.json
{ "purchaseid": { "ticketnumber": "23546852222", "location": "vizag", "Travelerhistory": { "trav": { "fname": "ramu", "lname": "gogi", "travelingarea": { "destination": { "stationid": "KAJKL", "stationname": "hyd" } }, "food": { "foodpref": [{ "foodcode": "CK567", "foodcodeSegment": "NOVEG" }, { "foodcode": "MM98", "foodcodeSegment": "VEG" } ] } } } } }

创建表语句
CREATE TABLE my_table(
purchaseid STRUCT<ticketnumber:STRING,location:STRING,
Travelerhistory:STRUCT<
trav:STRUCT<fname:STRING,lname:STRING,
travelingarea:STRUCT< destination :STRUCT<stationid:string,stationname:string>>,
food :STRUCT<foodpref:ARRAY<STRUCT<foodcode:string,foodcodeSegment:string>>>
>>>)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
LOCATION '/user/training/json2/';

查询表
select purchaseid.ticketnumber from my_table;
select purchaseid.travelerhistory.trav.fname from my_table;
select purchaseid.travelerhistory.trav.lname from my_table;

and so on

关于json - 如何从json数据创建一个hive表,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/61687816/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com