gpt4 book ai didi

google-bigquery - 将 UNNEST 与 with_query_name(WITH 子句)表中的列一起使用

转载 作者:行者123 更新时间:2023-12-05 08:41:11 25 4
gpt4 key购买 nike

出现以下错误

(100032) Error executing query job. Message: Unrecognized name: nested

nested 是我声明为 WITH 子句的临时表。尝试的代码如下:

WITH nested AS
(
SELECT e.my_id , SPLIT(secondary_ids, '<#>') AS arr_secondary_ids
FROM table_with_delimited_string_column e
WHERE my_id = 1234
)
SELECT DISTINCT a
FROM UNNEST(nested.arr_secondary_ids) a

SPLIT 函数将返回一个 ARRAY 类型,稍后将被 UNNESTed。

根据 Google Cloud 文档,这是可行的:

SELECT *
FROM UNNEST(ARRAY<STRUCT<x INT64, y STRING>>[(1, 'foo'), (3, 'bar')]);

这也有效:

WITH sequences AS
(SELECT 1 AS id, [0, 1, 1, 2, 3, 5] AS some_numbers
UNION ALL SELECT 2 AS id, [2, 4, 8, 16, 32] AS some_numbers
UNION ALL SELECT 3 AS id, [5, 10] AS some_numbers)
SELECT id, flattened_numbers
FROM sequences
CROSS JOIN UNNEST(sequences.some_numbers) AS flattened_numbers;

因此,从技术上讲,UNNEST 生成的值表应该能够直接从 SELECT * 中查询。此外,临时表中的列应该能够被取消嵌套。

但是,当我回到我的用例时,...UNNEST(nested.arr_secondary_ids) 将产生上述错误。

我希望能够立即查询它,因为我在结果表中得到了重复值,如您所见,我想使用 DISTINCT 消除这些值。当前的解决方法是什么?产生此错误的技术原因是什么?想知道它是否与我遗漏的 ARRAY 或 STRUCT 类型有关...

最佳答案

以前的答案对理解为什么需要 nested 有很大帮助。但是,我找到了最佳方案,用于确定主键(又名 my_id)何时对定义重复项很重要,何时不重要。

在我的真实场景中,my_id 很重要,我不想要重复的 my_idsecondary_id 对。无法避免 CROSS JOIN 或多表引用,因为源 CTE 表必须直接位于 FROM 子句中(如 @pruthvi-kumar@mikhail-berlyant 所述)。 @mikhail-berlyant还指出这只是一个展平操作,所以 CROSS JOIN 在这里也不是一个昂贵的操作。总而言之,解决方案是:

WITH nested AS
(
SELECT e.my_id , SPLIT(secondary_ids, '<#>') AS arr_secondary_ids
FROM table_with_delimited_string_column e
)
SELECT DISTINCT nested.entity_id, a
FROM nested CROSS JOIN UNNEST(arr_secondary_ids) a

但是,在发布的问题中,我声明了一个固定的 my_id = 1234,因此该列不会成为重复项的决定因素。在这种情况下,可以通过对要取消嵌套的数组使用标量子查询来跳过相关的交叉连接。这里的最佳解决方案是:

WITH nested AS
(
SELECT e.my_id , SPLIT(secondary_ids, '<#>') AS arr_secondary_ids
FROM table_with_delimited_string_column e
WHERE my_id = 1234
)
SELECT DISTINCT a
FROM UNNEST((SELECT arr_secondary_ids FROM nested)) AS a

请注意将 SELECT 包裹在 unnest 中的括号。它们是必需的,否则您会收到如下消息:

The argument to UNNEST is an expression, not a query; to use a query as an expression, the query must be wrapped with additional parentheses to make it a scalar subquery expression

nested 也应该在 my_id 过滤之后只有 1 行,否则你会发现一个讨厌的

Scalar subquery produced more than one element

关于google-bigquery - 将 UNNEST 与 with_query_name(WITH 子句)表中的列一起使用,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/51646321/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com