gpt4 book ai didi

sql - 将列的值分配给sql中的变量使用jinja模板语言

转载 作者:行者123 更新时间:2023-12-05 03:35:55 25 4
gpt4 key购买 nike

我有一个像这样的 sql 文件来转换一个表,其中有一列包含一个 json 字符串

{{ config(materialized='table') }}

with customer_orders as (
select
time,
data as jsonData,

{% set my_dict = fromjson( jsonData ) %}
{% do log("Printout: " ~ my_dict, info=true) %}

from `warehouses.raw_data.customer_orders`

limit 5
)

select *
from customer_orders

当我运行 dbt run 时,它返回如下:

Running with dbt=0.21.0
Encountered an error:
the JSON object must be str, bytes or bytearray, not Undefined

我什至无法打印出我想要的列的值:


{{ config(materialized='table') }}

with customer_orders as (
select
time,
tag,
data as jsonData,

{% do log("Printout: " ~ data, info=true) %}

from `warehouses.raw_data.customer_orders`

limit 5
)

select *
from customer_orders
22:42:58 | Concurrency: 1 threads (target='dev')
22:42:58 |
Printout:
22:42:58 | Done.

但是如果我创建另一个模型来打印出 jsonData 的值:

{%- set payment_methods = dbt_utils.get_column_values(
table=ref('customer_orders_model'),
column='jsonData'
) -%}

{% do log(payment_methods, info=true) %}

{% for json in payment_methods %}
{% set my_dict = fromjson(json) %}
{% do log(my_dict, info=true) %}
{% endfor %}

打印出我想要的json值

Running with dbt=0.21.0
This is log

Found 2 models, 0 tests, 0 snapshots, 0 analyses, 372 macros, 0 operations, 0 seed files, 0 sources, 0 exposures

21:41:15 | Concurrency: 1 threads (target='dev')
21:41:15 |

['{"log": "ok", "path": "/var/log/containers/...log", "time": "2021-10-26T08:50:52.412932061Z", "offset": 527, "stream": "stdout", "@timestamp": 1635238252.412932}']

{'log': 'ok', 'path': '/var/log/containers/...log', 'time': '2021-10-26T08:50:52.412932061Z', 'offset': 527, 'stream': 'stdout', '@timestamp': 1635238252.412932}

21:41:21 | Done.

但我想在像上面的 customer_orders_model 这样的模型文件中处理这个 jsonData。

我如何获取列的值并将其分配给变量并继续处理我想要的任何东西(检查 json 中是否有我想要的键并将其值设置为新列)。

注意:我的目的是:在我的表中,有一个json字符串列,我想将这个json字符串列提取到许多列中,以便我可以轻松地编写我想要的sql查询。

最佳答案

如果是 BigQuery 数据库,Google 有一个 JSON functions in Standard SQL

如果您的列是 JSON 字符串,我认为您可以使用 JSON_EXTRACT 来获取所需键的值

例如:

with customer_orders as (
select
time,
tag,
data as jsonData,
json_extract(data, '$.log') AS log,
from `dc-warehouses.raw_data.logs_trackfoe_prod`
limit 5
)
select *
from customer_orders

关于sql - 将列的值分配给sql中的变量使用jinja模板语言,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/69757031/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com