
python - Reading a text file from Amazon S3 with PySpark

Reposted. Author: 行者123. Updated: 2023-12-01 01:50:46

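I am trying to get a Spark cluster to read a data source from Amazon S3 cloud storage. It fails with the error below, and I need some help diagnosing the problem: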

>>> sc.textFile("s3a://storage-bucket/s3test.txt").collect()

py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: com.amazonaws.services.s3.model.AmazonS3Exception: Status Code: 400, AWS Service: Amazon S3, AWS Request ID: D47397DA8BCB4669, AWS Error Code: null, AWS Error Message: Bad Request, S3 Extended Request ID: /aBi99tozgFEsdRGubDwhriMsNQvl1jLOf8AJquA8VXxzkpPL/LLCWDFQQvYn4snHx5gx66/pXo=

By the way, this works fine:

$ aws s3 cp s3://storage-bucket/s3test.txt ./s3text.txt
download: s3://storage-bucket/s3test.txt to ./s3text.txt
$ cat s3text.txt
Hello S3

More details from the error message:

Caused by: org.jets3t.service.S3ServiceException: Service Error Message. -- ResponseCode: 403, ResponseStatus: Forbidden, XML Error Message: <?xml version="1.0" encoding="UTF-8"?><Error><Code>SignatureDoesNotMatch</Code><Message>The request signature we calculated does not match the signature you provided. Check your key and signing method.</Message><AWSAccessKeyId>xxxxxxxxxxxxxxxxxx</AWSAccessKeyId><St

Best answer

Can you check your fs.s3a.access.key and fs.s3a.secret.key and make sure they match the credentials you used for the aws s3 cp test? A SignatureDoesNotMatch error can appear when the credentials are wrong. Try hdfs dfs -ls s3a://storage-bucket/
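
One way to make the two sets of credentials match is to read the key pair the AWS CLI uses and pass exactly that pair to Spark. The following is only a minimal sketch under several assumptions: the CLI is taken to read its keys from the [default] profile of ~/.aws/credentials (it may instead be using environment variables or an instance role), the helper names parse_cli_credentials, s3a_conf and read_with_cli_credentials are made up for this illustration, and the S3A connector (hadoop-aws) is assumed to be on the Spark classpath already.

```python
import configparser
from pathlib import Path


def parse_cli_credentials(text, profile="default"):
    """Return (access_key, secret_key) for one profile of an AWS CLI credentials file."""
    # interpolation=None keeps any '%' in a value from being treated as a template.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    section = parser[profile]
    return section["aws_access_key_id"], section["aws_secret_access_key"]


def s3a_conf(access_key, secret_key):
    """Map a key pair onto Spark properties; the spark.hadoop. prefix forwards them to Hadoop."""
    return {
        "spark.hadoop.fs.s3a.access.key": access_key,
        "spark.hadoop.fs.s3a.secret.key": secret_key,
    }


def read_with_cli_credentials(s3a_path, credentials_file=None):
    """Hypothetical helper: read a text file over s3a:// using the CLI's key pair."""
    # Imported lazily so the two helpers above can be used without PySpark installed.
    from pyspark.sql import SparkSession

    # Assumption: the CLI credentials live in the default location.
    credentials_file = credentials_file or Path.home() / ".aws" / "credentials"
    access_key, secret_key = parse_cli_credentials(Path(credentials_file).read_text())

    builder = SparkSession.builder.appName("s3a-credential-check")
    for key, value in s3a_conf(access_key, secret_key).items():
        builder = builder.config(key, value)
    spark = builder.getOrCreate()
    try:
        return spark.sparkContext.textFile(s3a_path).collect()
    finally:
        spark.stop()
```

Calling read_with_cli_credentials("s3a://storage-bucket/s3test.txt") from a plain Python process would then repeat the failing read with the CLI's keys. If a Spark session already exists, as in the pyspark shell, getOrCreate() returns that session and these settings may not take effect, so the sketch is best run as a standalone script.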

Regarding "python - Reading a text file from Amazon S3 with PySpark", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/50741418/
