python - Azure SDK for Python: read blob data in chunks

Reposted. Author: 行者123. Updated: 2023-12-03 06:38:08

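I am reading data from a blob that is about 5 GB in size. I usually work with data of around 500 MB, so I am trying to read the blob in smaller blocks, for example 300 MB, over several iterations. Is there a way to read the data in smaller increments instead of calling readall()?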

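# Pass the credential by keyword: the fourth positional parameter of BlobClient is `snapshot`.
blob_client = BlobClient(blob_service_client.url,
                         container_name,
                         blob_name,
                         credential=credential)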

data_stream = blob_client.download_blob()
data = data_stream.readall()

How would I use the chunks approach below with the BlobServiceClient above?

import uuid
from azure.storage.blob import BlobBlock

# This returns a StorageStreamDownloader.
stream = source_blob_client.download_blob()
block_list = []

# Read data in chunks to avoid loading all into memory at once
for chunk in stream.chunks():
    # process your data (anything can be done here really. `chunk` is a byte array).
    block_id = str(uuid.uuid4())
    destination_blob_client.stage_block(block_id=block_id, data=chunk)
    block_list.append(BlobBlock(block_id=block_id))

Best answer

I tried this in my environment and got the following results:

How would I use the below chunks with the above BlobServiceClient?

Code:

import uuid

from azure.storage.blob import BlobServiceClient, BlobBlock

connection_string = "storage connection string"
blob_service_client = BlobServiceClient.from_connection_string(connection_string)
container_client = blob_service_client.get_container_client('test1')
blob_client = container_client.get_blob_client("file.pdf")

# Upload the local file in 4 MiB blocks
block_list = []
chunk_size = 4 * 1024 * 1024
with open("C:\\Users\\****\\****\\sample12 (2).pdf", 'rb') as f:
    while True:
        read_data = f.read(chunk_size)
        if not read_data:
            break  # done
        blk_id = str(uuid.uuid4())
        # Stage the block; it is not part of the blob until the list is committed
        blob_client.stage_block(block_id=blk_id, data=read_data)
        block_list.append(BlobBlock(block_id=blk_id))

# Combine the staged blocks, in order, into a single blob
blob_client.commit_block_list(block_list)

To upload each block, you can use the BlobClient.stage_block method. Once the blocks are uploaded, the BlobClient.commit_block_list method combines them into a single blob.
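The code above covers the upload side. For the download side that the question asks about, one possible approach is to request explicit byte ranges rather than calling readall() on the whole blob. The following is only a sketch under stated assumptions: the helper names `byte_ranges` and `read_blob_in_ranges` are hypothetical, and it relies on `BlobClient.get_blob_properties().size` and the `offset` / `length` keyword arguments of `download_blob` as found in azure-storage-blob v12. The 300 MiB default mirrors the size mentioned in the question.

```python
from typing import Iterator, Tuple


def byte_ranges(total_size: int, chunk_size: int) -> Iterator[Tuple[int, int]]:
    """Yield (offset, length) pairs that together cover total_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for offset in range(0, total_size, chunk_size):
        # The last range may be shorter than chunk_size.
        yield offset, min(chunk_size, total_size - offset)


def read_blob_in_ranges(blob_client, chunk_size: int = 300 * 1024 * 1024) -> Iterator[bytes]:
    """Yield the blob's content piece by piece.

    blob_client is assumed to be an azure.storage.blob.BlobClient
    (hypothetical helper, not part of the SDK).
    """
    total_size = blob_client.get_blob_properties().size
    for offset, length in byte_ranges(total_size, chunk_size):
        # Only the requested byte range is downloaded and held in memory.
        yield blob_client.download_blob(offset=offset, length=length).readall()


# Usage (assumes an existing BlobClient named blob_client):
# for piece in read_blob_in_ranges(blob_client):
#     process(piece)  # `process` is a placeholder for your own logic
```

Each iteration holds at most one piece of roughly chunk_size bytes, so peak memory stays near the chosen size instead of the full 5 GB.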

Console: [image]

Portal: [image]

You can also refer to the SO thread by Jim Xu for another approach to transferring blocks between two containers.
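To tie this back to the `stream.chunks()` loop in the question: the pieces yielded by `chunks()` are typically much smaller than 300 MB (their size is governed by the client's `max_chunk_get_size` setting, which I believe defaults to 4 MiB). If larger working batches are wanted, the small chunks can be regrouped on the client side. The helper below is a hypothetical sketch, not part of the SDK; it works on any iterable of bytes objects.

```python
from typing import Iterable, Iterator


def rebatch(chunks: Iterable[bytes], batch_size: int) -> Iterator[bytes]:
    """Regroup small byte chunks into batches of exactly batch_size bytes.

    The final batch may be shorter. Hypothetical helper for illustration.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    buffer = bytearray()
    for chunk in chunks:
        buffer.extend(chunk)
        # Emit full batches as soon as enough data has accumulated.
        while len(buffer) >= batch_size:
            yield bytes(buffer[:batch_size])
            del buffer[:batch_size]
    if buffer:
        # Whatever is left over forms the last, shorter batch.
        yield bytes(buffer)


# Usage (assumes an existing BlobClient named blob_client):
# stream = blob_client.download_blob()
# for batch in rebatch(stream.chunks(), 300 * 1024 * 1024):
#     process(batch)  # `process` is a placeholder for your own logic
```

The buffer never holds more than one batch plus one incoming chunk, so memory use stays bounded by the batch size you choose.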

Regarding "python - Azure SDK for Python: read blob data in chunks", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/74723660/
