presto - Unnest map values as separate columns in Athena/Presto

Reposted. Author: 行者123. Updated: 2023-12-02 02:32:45

My question is somewhat similar to this one (Athena/Presto - UNNEST MAP to columns). In my case, however, I know in advance which columns I need.

My use case is as follows.

I have a JSON blob with the following structure:

{
  "reqId" : "1234",
  "clientId" : "client",
  "response" : [
    {
      "name" : "Susan",
      "projects" : [
        {
          "name" : "project1",
          "completed" : true
        },
        {
          "name" : "project2",
          "completed" : false
        }
      ]
    },
    {
      "name" : "Adams",
      "projects" : [
        {
          "name" : "project1",
          "completed" : true
        },
        {
          "name" : "project2",
          "completed" : false
        }
      ]
    }
  ]
}

I need to create a view that returns output like this:

 name  | project  | completed
-------+----------+-----------
 Susan | project1 | true
 Susan | project2 | false
 Adams | project1 | true
 Adams | project2 | false

I tried the following, among other approaches. This is the closest I could get:

WITH dataset AS (
    SELECT
        'Susan' AS name,
        transform(
            filter(
                CAST(json_extract('{"projects": [{"name":"project1", "completed":false}, {"name":"project3", "completed":false}, {"name":"project2", "completed":true}]}', '$.projects') AS ARRAY<MAP<VARCHAR, VARCHAR>>),
                p -> (p['name'] != 'project1')
            ),
            p -> ROW(map_values(p))
        ) AS projects
)
SELECT * FROM dataset
CROSS JOIN UNNEST(projects)
This is the output I get:


   name   projects                                                  _col2
1  Susan  [{field0=[project3, false]}, {field0=[project2, true]}]  {field0=[project3, false]}
2  Susan  [{field0=[project3, false]}, {field0=[project2, true]}]  {field0=[project2, true]}

Essentially, I want to unnest the map's key-value pairs into separate columns. How can I do this in Presto/Athena?

Best answer

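Your JSON example seems to be invalid: it is missing a comma after "name" : "Susan" and after "name" : "Adams". Apart from that, you can get the expected output with the query below. You need to UNNEST twice, and some casts are required as well: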

with dataset as
(
    -- Parse the sample document into a JSON value
    select json_parse('{"reqId" : "1234","clientId" : "client","response" : [{"name" : "Susan","projects" : [{"name" : "project1","completed" : true},{"name" : "project2","completed" : false}]},{"name" : "Adams","projects" : [{"name" : "project1","completed" : true},{"name" : "project2","completed" : false}]}]}') as json_col
)
,unnest_response as
(
    -- First UNNEST: one row per element of the "response" array
    select *
    from dataset
    cross join UNNEST(cast(json_extract(json_col, '$.response') as array<JSON>)) as t (response)
)
-- Second UNNEST: one row per project of each person
select
    json_extract_scalar(response, '$.name') name,
    json_extract_scalar(project, '$.name') project_name,
    json_extract_scalar(project, '$.completed') project_completed
from unnest_response
cross join UNNEST(cast(json_extract(response, '$.projects') as array<JSON>)) as t (project);
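Because the wanted columns are known in advance, the attempt from the question could probably also be finished by subscripting each map with its known keys instead of wrapping map_values in a ROW. The following is only a sketch under that assumption. It reuses the sample literal and the ARRAY<MAP<VARCHAR, VARCHAR>> cast from the question; the aliases t and p and the output column names are illustrative choices.

```sql
-- Sketch: cast each project to a MAP, unnest the array, then read the known keys.
WITH dataset AS (
    SELECT
        'Susan' AS name,
        CAST(
            json_extract(
                '{"projects": [{"name":"project1", "completed":false}, {"name":"project3", "completed":false}, {"name":"project2", "completed":true}]}',
                '$.projects'
            ) AS ARRAY<MAP<VARCHAR, VARCHAR>>
        ) AS projects
)
SELECT
    name,
    p['name'] AS project,          -- map subscript on a key known in advance
    p['completed'] AS completed    -- still a VARCHAR at this point
FROM dataset
CROSS JOIN UNNEST(projects) AS t (p)   -- one row per map in the array
WHERE p['name'] != 'project1';         -- same condition as the filter() lambda in the question
```

Each remaining project should come back as its own row, with project and completed as separate columns. Note that a map subscript fails on a missing key, so this variant assumes every project object carries both keys.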

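Since the question asks for a view, the double-UNNEST query can be wrapped in a CREATE VIEW statement. This is a minimal sketch, not a tested definition: the view name project_status, the table name requests, and its VARCHAR column json_col are hypothetical placeholders that do not appear in the question. It also casts completed to BOOLEAN, which is an addition rather than part of the answer's query.

```sql
-- Hypothetical names: view project_status, table requests, VARCHAR column json_col.
CREATE OR REPLACE VIEW project_status AS
WITH unnest_response AS (
    -- First UNNEST: one row per element of the "response" array
    SELECT response
    FROM requests
    CROSS JOIN UNNEST(CAST(json_extract(json_col, '$.response') AS array<JSON>)) AS t (response)
)
SELECT
    json_extract_scalar(response, '$.name') AS name,
    json_extract_scalar(project, '$.name') AS project,
    -- json_extract keeps the JSON boolean, so it can be cast to BOOLEAN
    CAST(json_extract(project, '$.completed') AS BOOLEAN) AS completed
FROM unnest_response
-- Second UNNEST: one row per project of each person
CROSS JOIN UNNEST(CAST(json_extract(response, '$.projects') AS array<JSON>)) AS t (project);
```

Querying the view with SELECT * FROM project_status would then return the name, project, and completed columns described in the question, one row per person and project.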
Regarding "presto - Unnest map values as separate columns in Athena/Presto", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/64798108/
