
python - Automatically generate a BigQuery schema from a Python dictionary

Reposted. Author: 太空宇宙. Updated: 2023-11-04 04:12:35

How can I automatically generate a BigQuery table schema from a Python dictionary?

For example:

dict = {'data': 'some_data', 'me': 8}
schema = BigQuery.generateSchema(dict)

#schema is now:
# {'fields': [
# {'name': 'data', 'type': 'STRING', 'mode': 'NULLABLE'},
# {'name': 'me', 'type': 'INT', 'mode': 'NULLABLE'}
# ]}

Does anything like this exist?

Best answer

There is currently no way to do this through the BigQuery Python library.

Here is a recursive function that does it.

import datetime
from google.cloud.bigquery.schema import SchemaField

# [START] map_dict_to_bq_schema

# Map Python types to BigQuery field types
field_type = {
    str: 'STRING',
    bytes: 'BYTES',
    int: 'INTEGER',
    float: 'FLOAT',
    bool: 'BOOLEAN',
    datetime.datetime: 'DATETIME',
    datetime.date: 'DATE',
    datetime.time: 'TIME',
    dict: 'RECORD',
}


# Take a dictionary and return a BigQuery schema
# (a list of SchemaField objects)
def map_dict_to_bq_schema(source_dict):

    # SchemaField list
    schema = []

    # Iterate over the dictionary
    for key, value in source_dict.items():

        if type(value) in field_type:
            # Scalar or nested dict: NULLABLE by default
            bq_type, mode, sample = field_type[type(value)], 'NULLABLE', value
        elif isinstance(value, (list, tuple)) and value and type(value[0]) in field_type:
            # A non-empty list is a REPEATED field, typed by its first element
            bq_type, mode, sample = field_type[type(value[0])], 'REPEATED', value[0]
        else:
            # None, empty lists and unsupported types carry no type
            # information, so the key is skipped
            continue

        # For a STRUCT / RECORD field, recurse into the nested dictionary
        fields = map_dict_to_bq_schema(sample) if bq_type == 'RECORD' else ()

        # Add the field to the list of fields
        schema.append(SchemaField(key, bq_type, mode=mode, fields=fields))

    # Return the list of SchemaField objects
    return schema

# [END] map_dict_to_bq_schema

Example:



>>> map_dict_to_bq_schema({'data': 'some_data', 'me': 8})
# Output (the exact repr depends on the library version)
[SchemaField('data', 'STRING', 'NULLABLE', None, ()), SchemaField('me', 'INTEGER', 'NULLABLE', None, ())]


>>> map_dict_to_bq_schema({'data': {'data2': 'some_data', 'me2': 8}, 'me': 8, 'h':[5,6,7]})
# Output (field order follows the dict's iteration order, which was arbitrary before Python 3.7)
[SchemaField('h', 'INTEGER', 'REPEATED', None, ()), SchemaField('me', 'INTEGER', 'NULLABLE', None, ()), SchemaField('data', 'RECORD', 'NULLABLE', None, [SchemaField('data2', 'STRING', 'NULLABLE', None, ()), SchemaField('me2', 'INTEGER', 'NULLABLE', None, ())])]

I used @luckylwk's code from this question as a reference: How to map a Python Dict to a Big Query Schema, specifically for the nested and repeated columns.

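Also, take a look at the SchemaField class in the BQ Python library. From there, you can get the schema in the format you need for the Python client, the CLI, or whatever matches your use case.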
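If what you need is the plain name/type/mode layout shown in the question, or a JSON schema file for the bq CLI, the same recursion can build ordinary dicts instead of SchemaField objects. The following is only a sketch under a few assumptions: dict_to_json_schema and FIELD_TYPE are hypothetical names, not part of the BigQuery library; the type map is copied from the answer above; and keys whose value is None, an empty list, or an unsupported type are skipped because they carry no type information.

```python
import datetime
import json

# Python type -> BigQuery type name (same mapping as the answer above)
FIELD_TYPE = {
    str: "STRING",
    bytes: "BYTES",
    int: "INTEGER",
    float: "FLOAT",
    bool: "BOOLEAN",
    datetime.datetime: "DATETIME",
    datetime.date: "DATE",
    datetime.time: "TIME",
    dict: "RECORD",
}


def dict_to_json_schema(source_dict):
    """Hypothetical helper: return the schema as a list of plain field dicts."""
    fields = []
    for key, value in source_dict.items():
        mode, sample = "NULLABLE", value
        if isinstance(value, (list, tuple)):
            if not value:
                continue  # an empty list carries no type information
            # A non-empty list is treated as REPEATED, typed by its first element
            mode, sample = "REPEATED", value[0]
        bq_type = FIELD_TYPE.get(type(sample))
        if bq_type is None:
            continue  # None or an unsupported type: skip the key
        field = {"name": key, "type": bq_type, "mode": mode}
        if bq_type == "RECORD":
            # Recurse into the nested dictionary for STRUCT / RECORD fields
            field["fields"] = dict_to_json_schema(sample)
        fields.append(field)
    return fields


row = {"data": {"data2": "some_data", "me2": 8}, "me": 8, "h": [5, 6, 7]}
schema = dict_to_json_schema(row)
print(json.dumps(schema, indent=2))
```

Writing the printed JSON to a file should give something usable as a schema file for the CLI. Note that the types are inferred from a single sample row, so a list with mixed element types or a value that is sometimes None will need manual adjustment.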

Regarding "python - Automatically generate a BigQuery schema from a Python dictionary", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/56079925/
