python - Reading an Excel file from S3 into a Pandas DataFrame

Reposted. Author: 行者123. Updated: 2023-12-01 01:17:28

I have an SNS notification set up that triggers a Lambda function whenever an .xlsx file is uploaded to an S3 bucket.

The Lambda function reads the .xlsx file into a Pandas DataFrame.

import os
import pandas as pd
import json
import xlrd
import boto3

def main(event, context):
    message = event['Records'][0]['Sns']['Message']
    parsed_message = json.loads(message)
    src_bucket = parsed_message['Records'][0]['s3']['bucket']['name']
    filepath = parsed_message['Records'][0]['s3']['object']['key']

    s3 = boto3.resource('s3')
    s3_client = boto3.client('s3')

    obj = s3_client.get_object(Bucket=src_bucket, Key=filepath)
    print(obj['Body'])

    df = pd.read_excel(obj, header=2)
    print(df.head(2))

I get the following error:

Invalid file path or buffer object type: <type 'dict'>: ValueError
Traceback (most recent call last):
  File "/var/task/handler.py", line 26, in main
    df = pd.read_excel(obj, header=2)
  File "/var/task/pandas/util/_decorators.py", line 178, in wrapper
    return func(*args, **kwargs)
  File "/var/task/pandas/util/_decorators.py", line 178, in wrapper
    return func(*args, **kwargs)
  File "/var/task/pandas/io/excel.py", line 307, in read_excel
    io = ExcelFile(io, engine=engine)
  File "/var/task/pandas/io/excel.py", line 376, in __init__
    io, _, _, _ = get_filepath_or_buffer(self._io)
  File "/var/task/pandas/io/common.py", line 218, in get_filepath_or_buffer
    raise ValueError(msg.format(_type=type(filepath_or_buffer)))
ValueError: Invalid file path or buffer object type: <type 'dict'>

How can I fix this?

Best answer

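This is expected! obj is a dictionary, not a file: get_object returns the response metadata, with the file contents as a stream under the 'Body' key (capital B, as in your print call). Have you tried this?

df = pd.read_excel(obj['Body'], header=2)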
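Passing the stream under the 'Body' key straight to pandas may still fail on some versions, since the stream may not support seeking while an .xlsx file needs random access. Here is a minimal sketch of one way around that, assuming Python 3 and a pandas installation with an Excel engine available: read the whole body into an in-memory buffer first. The helper names extract_s3_location and body_to_buffer are hypothetical, introduced only for illustration; the event layout is the one from the question.

```python
import io
import json


def extract_s3_location(event):
    """Return (bucket, key) from an SNS-wrapped S3 event, as in the question."""
    message = json.loads(event['Records'][0]['Sns']['Message'])
    s3_info = message['Records'][0]['s3']
    return s3_info['bucket']['name'], s3_info['object']['key']


def body_to_buffer(obj):
    """Copy the streamed object body into a seekable in-memory buffer."""
    # get_object returns a dict; the file content is a stream under 'Body'.
    return io.BytesIO(obj['Body'].read())


def main(event, context):
    # Imported here only so the helpers above can be exercised without
    # boto3 or pandas installed; top-level imports work just as well.
    import boto3
    import pandas as pd

    bucket, key = extract_s3_location(event)
    obj = boto3.client('s3').get_object(Bucket=bucket, Key=key)

    # header=2 is kept from the question: the third row holds the column names.
    df = pd.read_excel(body_to_buffer(obj), header=2)
    print(df.head(2))
```

This loads the entire file into memory, which is usually fine for spreadsheets but worth keeping in mind against the Lambda memory limit.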

Regarding "python - Reading an Excel file from S3 into a Pandas DataFrame", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/54185530/
