
google-cloud-platform - BigQuery 存储 API : the table has a storage format that is not supported

Reposted | Author: 行者123 | Updated: 2023-12-04 05:41:03

Following the example in the BQ documentation, I read a BQ table into a pandas DataFrame with the following query:

from google.cloud import bigquery
from google.cloud import bigquery_storage_v1beta1

# Client setup as in the BigQuery docs example of that era (v1beta1 Storage API).
bqclient = bigquery.Client()
bqstorageclient = bigquery_storage_v1beta1.BigQueryStorageClient()

query_string = """
SELECT
  CONCAT(
    'https://stackoverflow.com/questions/',
    CAST(id as STRING)) as url,
  view_count
FROM `bigquery-public-data.stackoverflow.posts_questions`
WHERE tags like '%google-bigquery%'
ORDER BY view_count DESC
"""

dataframe = (
    bqclient.query(query_string)
    .result()
    .to_dataframe(bqstorage_client=bqstorageclient)
)
print(dataframe.head())

                                            url  view_count
0  https://stackoverflow.com/questions/22879669       48540
1  https://stackoverflow.com/questions/13530967       45778
2  https://stackoverflow.com/questions/35159967       40458
3  https://stackoverflow.com/questions/10604135       39739
4  https://stackoverflow.com/questions/16609219       34479

However, when I try the same thing against any other, non-public dataset, I get the following error:

google.api_core.exceptions.FailedPrecondition: 400 there was an error creating the session: the table has a storage format that is not supported

Do I need to configure something on my table so that it works with the BQ Storage API?

This works:

>>> query_string = """SELECT funding_round_type, count(*) FROM `datadocs-py.datadocs.investments` GROUP BY funding_round_type ORDER BY 2 DESC LIMIT 2"""
>>> bqclient.query(query_string).result().to_dataframe()

  funding_round_type     f0_
0            venture  104157
1               seed   43747

But when I run the same query using the bqstorageclient, I get this error:

>>> bqclient.query(query_string).result().to_dataframe(bqstorage_client=bqstorageclient)

Traceback (most recent call last):
  File "/Users/david/Desktop/V/lib/python3.6/site-packages/google/api_core/grpc_helpers.py", line 57, in error_remapped_callable
    return callable_(*args, **kwargs)
  File "/Users/david/Desktop/V/lib/python3.6/site-packages/grpc/_channel.py", line 533, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/Users/david/Desktop/V/lib/python3.6/site-packages/grpc/_channel.py", line 467, in _end_unary_response_blocking
    raise _Rendezvous(state, None, None, deadline)
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with:
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with:
    status = StatusCode.FAILED_PRECONDITION
    details = "there was an error creating the session: the table has a storage format that is not supported"
    debug_error_string = "{"created":"@1565047973.444089000","description":"Error received from peer","file":"src/core/lib/surface/call.cc","file_line":1017,"grpc_message":"there was an error creating the session: the table has a storage format that is not supported","grpc_status":9}"
>

Accepted answer

I ran into the same problem on 2019-11-06. It turns out the error you are seeing is a known issue with the Read API: it currently cannot handle result sets smaller than 10MB. I came across this issue, which clarifies it: GitHub.com - GoogleCloudPlatform/spark-bigquery-connector - FAILED_PRECONDITION: there was an error creating the session: the table has a storage format that is not supported #46

I tested it with queries that return result sets larger than 10MB, and it appears to work fine for the EU multi-region location of the dataset I queried.

Also, you need fastavro installed in your environment to use this functionality.
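Until the known issue is fixed, one practical workaround is to catch the failure and fall back to the regular REST download path, which handles small result sets fine. Below is a minimal, self-contained sketch of that pattern; the FailedPrecondition class and the two fetch functions are stand-ins, since in real code you would catch google.api_core.exceptions.FailedPrecondition around RowIterator.to_dataframe():

```python
class FailedPrecondition(Exception):
    """Stand-in for google.api_core.exceptions.FailedPrecondition."""

def fetch_via_storage_api(query):
    # Stand-in for result.to_dataframe(bqstorage_client=...).
    # Small (<10MB) result sets currently fail session creation like this.
    raise FailedPrecondition(
        "400 there was an error creating the session: the table has a "
        "storage format that is not supported"
    )

def fetch_via_rest(query):
    # Stand-in for result.to_dataframe() without a storage client
    # (the slower but reliable tabledata.list download path).
    return [("venture", 104157), ("seed", 43747)]

def fetch_with_fallback(query):
    """Try the fast Storage API path; fall back to REST on failure."""
    try:
        return fetch_via_storage_api(query)
    except FailedPrecondition:
        return fetch_via_rest(query)

rows = fetch_with_fallback("SELECT funding_round_type, count(*) ...")
print(rows[0])  # -> ('venture', 104157)
```

The same try/except shape drops straight into the snippets above: attempt to_dataframe(bqstorage_client=bqstorageclient) first, and on FailedPrecondition retry with a plain to_dataframe().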

Regarding "google-cloud-platform - BigQuery Storage API: the table has a storage format that is not supported", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/57367139/
