sql - Splitting a string into substrings in BigQuery and creating a new column for each substring

Reposted. Author: 行者123. Updated: 2023-12-05 02:18:51

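I want to split a space-delimited string into 5 parts and create a column for each part, but I am having trouble producing the desired output. Edit: I am using the standard SQL dialect.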

Sample data:

Row  published_at              data_string                                              device_id
1    2016-10-26T22:53:03.209Z  70.77 3.38 61.65 7.98 73.20 3.29 63.55 nan nan nan nan  2a0025000351353337353037
...
1 of 570 rows

Desired output:

Row  published_at              battery  temp1  humid1  temp2  humid2  temp3  humid3  device_id
1    2016-11-03T16:24:09.833Z  70.77    3.38   61.65   7.98   73.20   3.29   63.55   2a0025000351353337353037
1 of 570 rows

Attempted query 1.a:

WITH
  h2a0025_2 AS (
    SELECT
      TIMESTAMP '2016-10-26T22:53:03.209Z' AS published_at,
      '70.77 3.38 61.65 7.98 73.20 3.29 63.55 nan nan nan nan' AS data_string,
      '2a0025000351353337353037' AS device_id
    UNION ALL
    SELECT
      TIMESTAMP '2016-10-26T22:53:03.209Z',
      '70.77 3.38 61.65 7.98 73.20 3.29 63.55 nan nan nan nan',
      '2a0025000351353337353037' )
SELECT
  published_at,
  -- offsets 0 through 6 map to the seven readings, in order
  parts[OFFSET(0)] AS Battery,
  parts[OFFSET(1)] AS Temp1,
  parts[OFFSET(2)] AS Humid1,
  parts[OFFSET(3)] AS Temp2,
  parts[OFFSET(4)] AS Humid2,
  parts[OFFSET(5)] AS Temp3,
  parts[OFFSET(6)] AS Humid3,
  device_id
FROM (
  SELECT
    * EXCEPT(data_string),
    SPLIT(data_string, ' ') AS parts
  FROM
    `h2a0025_2`);

Result 1.a: 2 identical rows

Row  published_at              battery  temp1  humid1  temp2  humid2  temp3  humid3  device_id
1    2016-11-03T16:24:09.833Z  70.77    3.38   61.65   7.98   73.20   3.29   63.55   2a0025000351353337353037
2    2016-11-03T16:24:09.833Z  70.77    3.38   61.65   7.98   73.20   3.29   63.55   2a0025000351353337353037
2 of 2 rows

Attempt 2:

SELECT
  published_at,
  -- offsets 0 through 6 map to the seven readings, in order
  parts[OFFSET(0)] AS Battery,
  parts[OFFSET(1)] AS Temp1,
  parts[OFFSET(2)] AS Humid1,
  parts[OFFSET(3)] AS Temp2,
  parts[OFFSET(4)] AS Humid2,
  parts[OFFSET(5)] AS Temp3,
  parts[OFFSET(6)] AS Humid3,
  device_id
FROM (
  SELECT
    * EXCEPT(data_string),
    SPLIT(data_string, ' ') AS parts
  FROM
    `myproject.mydataset.h2a0025_2`);

Result: Query failed. Error: Array index 3 is out of bounds (overflow)

Best answer

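Here is an example to get you started. Rather than trying to work out the correct substring positions, use the SPLIT function and then pick the offsets you want from the resulting array.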

#standardSQL
WITH YourTable AS (
  SELECT
    TIMESTAMP '2016-11-03T16:24:09.833Z' AS published_at,
    '80.91 22.15 45.35 14.41 64.54' AS data_string
  UNION ALL
  SELECT
    TIMESTAMP '2016-11-04T18:34:08.143Z',
    '75.37 28.43 31.17 34.80 19.33'
)
SELECT
  published_at,
  parts[OFFSET(0)] AS Temp1,
  parts[OFFSET(1)] AS Humid1,
  parts[OFFSET(2)] AS Temp2,
  parts[OFFSET(3)] AS Humid2
FROM (
  SELECT
    * EXCEPT(data_string),
    SPLIT(data_string, ' ') AS parts
  FROM YourTable
);

To test against your real table, use only the following part of the script:

#standardSQL
SELECT
  published_at,
  parts[OFFSET(0)] AS Temp1,
  parts[OFFSET(1)] AS Humid1,
  parts[OFFSET(2)] AS Temp2,
  parts[OFFSET(3)] AS Humid2
FROM (
  SELECT
    * EXCEPT(data_string),
    SPLIT(data_string, ' ') AS parts
  FROM `yourproject.yourdataset.yourtable`
);
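
The "Array index 3 is out of bounds" error from attempt 2 suggests that at least one row in the real table splits into fewer than four parts, although the question does not show such a row. As a minimal sketch under that assumption, SAFE_OFFSET can be used in place of OFFSET: it returns NULL rather than raising an error when the index does not exist. The short second row below is hypothetical data added only to trigger the case, and the n_parts column is an illustrative helper for locating short rows.

```sql
#standardSQL
-- Hypothetical sample data: the second row is deliberately short
-- to mimic a row that would make OFFSET(3) fail.
WITH YourTable AS (
  SELECT
    TIMESTAMP '2016-11-03T16:24:09.833Z' AS published_at,
    '80.91 22.15 45.35 14.41 64.54' AS data_string
  UNION ALL
  SELECT
    TIMESTAMP '2016-11-04T18:34:08.143Z',
    '75.37 28.43'
)
SELECT
  published_at,
  ARRAY_LENGTH(parts) AS n_parts,    -- number of substrings in this row
  parts[SAFE_OFFSET(0)] AS Temp1,
  parts[SAFE_OFFSET(1)] AS Humid1,
  parts[SAFE_OFFSET(2)] AS Temp2,    -- NULL when the row has fewer than 3 parts
  parts[SAFE_OFFSET(3)] AS Humid2    -- NULL when the row has fewer than 4 parts
FROM (
  SELECT
    * EXCEPT(data_string),
    SPLIT(data_string, ' ') AS parts
  FROM YourTable
);
```

Filtering on n_parts against the real table should show which rows are malformed, so you can decide whether to keep them with NULL columns or exclude them.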

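Regarding "sql - Splitting a string into substrings in BigQuery and creating a new column for each substring", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/44056274/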
