
azure - Slow upload to Azure Blob Storage using Python


The API receives a file and then tries to create a unique blob name. I then upload 4 MB blocks to the blob. Each block takes about 8 seconds; is that normal? My upload speed is 110 Mbps. I tried uploading a 50 MB file and it took almost 2 minutes. I don't know whether the azure-storage-blob version is related to this; I am using azure-storage-blob==12.14.1.

```python
import uuid

from flask import request
from azure.storage.blob import BlobClient, BlobBlock, BlobServiceClient


@catalog_api.route("/catalog", methods=['POST'])
def catalog():
    file = request.files['file']
    url_bucket, file_name, file_type = upload_to_blob(file)


# The helpers below appear to be methods of a class that provides
# connection_string, container_name, max_blob_name_tries, chunk_size
# and generate_blob_name().
def upload_to_blob(self, file):
    file_name = file.filename
    file_type = file.content_type

    blob_client = self.generate_blob_client(file_name)
    blob_url = self.upload_chunks(blob_client, file)
    return blob_url, file_name, file_type


def generate_blob_client(self, file_name: str):
    blob_service_client = BlobServiceClient.from_connection_string(self.connection_string)
    container_client = blob_service_client.get_container_client(self.container_name)

    # Retry until a blob name is found that does not already exist.
    for _ in range(self.max_blob_name_tries):
        blob_name = self.generate_blob_name(file_name)
        blob_client = container_client.get_blob_client(blob_name)
        if not blob_client.exists():
            return blob_client
    raise Exception("Couldn't create the blob")


def upload_chunks(self, blob_client: BlobClient, file):
    block_list = []
    chunk_size = self.chunk_size
    while True:
        read_data = file.read(chunk_size)
        if not read_data:
            print("uploaded")
            break

        print("uploading")
        blk_id = str(uuid.uuid4())
        blob_client.stage_block(block_id=blk_id, data=read_data)
        block_list.append(BlobBlock(block_id=blk_id))

    blob_client.commit_block_list(block_list)
    return blob_client.url
```

Best Answer

I tried this in my environment and got the following results:

I uploaded a 50 MB file from my local environment to a Blob Storage account with a block size of 4*1024*1024, and it took 45 seconds.

Code:

```python
import uuid
import time

from azure.storage.blob import BlobBlock, BlobServiceClient

connection_string = "<storage account connection string>"
blob_service_client = BlobServiceClient.from_connection_string(connection_string)
container_client = blob_service_client.get_container_client('test')
blob_client = container_client.get_blob_client("file.pdf")

start = time.time()

# Upload the file in 4 MB blocks.
block_list = []
chunk_size = 4 * 1024 * 1024
with open("C:\\file.pdf", 'rb') as f:
    while True:
        read_data = f.read(chunk_size)
        if not read_data:
            break  # done
        blk_id = str(uuid.uuid4())
        blob_client.stage_block(block_id=blk_id, data=read_data)
        block_list.append(BlobBlock(block_id=blk_id))

blob_client.commit_block_list(block_list)
end = time.time()
print("Time taken to upload blob:", end - start, "secs")
```

In the code above, I added start and end timing calls around the upload and used end - start to measure how long uploading the file to blob storage took.

Console:

[screenshot: console output showing the measured upload time]

Make sure your internet speed is good; I also tried a few other connections, and at worst it took 78 seconds. For reference, 50 MB in 45 seconds is only about 9 Mbps, so when each sequential 4 MB stage_block call takes around 8 seconds, the bottleneck is the per-request round trip rather than raw bandwidth.
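If sequential round trips are the problem, staging the blocks concurrently usually speeds things up. The sketch below is an illustration, not part of the original answer: it reads all chunks first, stages them with a thread pool (the pool size of 4 is an arbitrary assumption to tune), and then commits the block list in file order. Alternatively, the SDK's one-shot upload_blob method accepts a max_concurrency parameter and parallelizes the block uploads internally.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor

from azure.storage.blob import BlobBlock, BlobServiceClient

connection_string = "<storage account connection string>"
blob_service_client = BlobServiceClient.from_connection_string(connection_string)
blob_client = blob_service_client.get_container_client('test').get_blob_client("file.pdf")

chunk_size = 4 * 1024 * 1024

# Read every chunk up front and assign each a block id. Staging order
# does not matter; only the commit order below defines the blob content.
chunks = []
with open("C:\\file.pdf", 'rb') as f:
    while True:
        data = f.read(chunk_size)
        if not data:
            break
        chunks.append((str(uuid.uuid4()), data))

# Stage blocks on 4 worker threads (assumed pool size); list() forces
# completion so any exception raised inside a worker is re-raised here.
with ThreadPoolExecutor(max_workers=4) as pool:
    list(pool.map(lambda c: blob_client.stage_block(block_id=c[0], data=c[1]), chunks))

# Commit the staged blocks in their original file order.
blob_client.commit_block_list([BlobBlock(block_id=bid) for bid, _ in chunks])
```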

Portal:

[screenshot: the uploaded blob shown in the Azure portal]

Regarding "azure - Slow upload to Azure Blob Storage using Python", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/75032295/
