gpt4 book ai didi

google-bigquery - 将 BigQuery 嵌套字段内容展平为新列而不是行

转载 作者:行者123 更新时间:2023-12-04 11:21:22 26 4
gpt4 key购买 nike

我有一些以下格式的 BigQuery 数据:

"thing": [
{
"name": "gameLost",
"params": [
{
"key": "total_games",
"val": {
"str_val": "3",
"int_val": null
}
},
{
"key": "games_won",
"val": {
"str_val": "2",
"int_val": null
}
},
{
"key": "game_time",
"val": {
"str_val": "44",
"int_val": null
}
}
],
"dt_a": "1470625311138000",
"dt_b": "1470620345566000"
}

我知道 FLATTEN() 函数会产生 3 行的输出,如下所示:

+------------+------------------+------------------+--------------------+--------------------------+--------------------------+
| thing.name | thing.dt_a | event_dim.dt_b | thing.params.key | thing.params.val.str_val | thing.params.val.int_val |
+------------+------------------+------------------+--------------------+--------------------------+--------------------------+
| gameLost | 1470625311138000 | 1470620345566000 | total_games_played | 3 | null |
| | | | | | |
| gameLost | 1470625311138000 | 1470620345566000 | games_won | 2 | null |
| | | | | | |
| gameLost | 1470625311138000 | 1470620345566000 | game_time | 44 | null |
+------------+------------------+------------------+--------------------+--------------------------+--------------------------+

对于每个更深层次的对象,更高层次的键/值被重复到新的行中。

但是,我需要将更深的键/值输出为全新的列,而不是重复字段,因此结果如下所示:

+------------+------------------+------------------+--------------------+-----------+-----------+
| thing.name | thing.dt_a | event_dim.dt_b | total_games_played | games_won | game_time |
+------------+------------------+------------------+--------------------+-----------+-----------+
| gameLost | 1470625311138000 | 1470620345566000 | 3 | 2 | 44 |
+------------+------------------+------------------+--------------------+-----------+-----------+

我该怎么做?
谢谢!

最佳答案

Standard SQL使其更易于表达(取消选中“显示选项”下的“使用旧版 SQL”):

WITH T AS (
SELECT STRUCT(
"gameLost" AS name,
ARRAY<STRUCT<key STRING, val STRUCT<str_val STRING, int_val INT64>>>[
STRUCT("total_games", STRUCT("3", NULL)),
STRUCT("games_won", STRUCT("2", NULL)),
STRUCT("game_time", STRUCT("44", NULL))] AS params,
1470625311138000 AS dt_a,
1470620345566000 AS dt_b) AS thing
)
SELECT
(SELECT AS STRUCT thing.* EXCEPT (params)) AS thing,
thing.params[OFFSET(0)].val.str_val AS total_games_played,
thing.params[OFFSET(1)].val.str_val AS games_won,
thing.params[OFFSET(2)].val.str_val AS game_time
FROM T;
+-------------------------------------------------------------------------+--------------------+-----------+-----------+
| thing | total_games_played | games_won | game_time |
+-------------------------------------------------------------------------+--------------------+-----------+-----------+
| {"name":"gameLost","dt_a":"1470625311138000","dt_b":"1470620345566000"} | 3 | 2 | 44 |
+-------------------------------------------------------------------------+--------------------+-----------+-----------+

如果不知道键在数组中的顺序,可以使用subselects来提取相关值:

WITH T AS (
SELECT STRUCT(
"gameLost" AS name,
ARRAY<STRUCT<key STRING, val STRUCT<str_val STRING, int_val INT64>>>[
STRUCT("total_games", STRUCT("3", NULL)),
STRUCT("games_won", STRUCT("2", NULL)),
STRUCT("game_time", STRUCT("44", NULL))] AS params,
1470625311138000 AS dt_a,
1470620345566000 AS dt_b) AS thing
)
SELECT
(SELECT AS STRUCT thing.* EXCEPT (params)) AS thing,
(SELECT val.str_val FROM UNNEST(thing.params) WHERE key = "total_games") AS total_games_played,
(SELECT val.str_val FROM UNNEST(thing.params) WHERE key = "games_won") AS games_won,
(SELECT val.str_val FROM UNNEST(thing.params) WHERE key = "game_time") AS game_time
FROM T;

关于google-bigquery - 将 BigQuery 嵌套字段内容展平为新列而不是行,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/38839559/

26 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com