
python - Sagemaker: How do I set content_type in Predictor (Sagemaker > 2.0)?

Reposted. Author: 行者123. Updated: 2023-12-05 00:55:09

I need help resolving the following error.

An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (415) from model with message "Content-type application/octet-stream not supported. Supported content-type is text/csv, text/libsvm"

Here is the relevant code:

import boto3
import sagemaker
from sagemaker import image_uris
from sagemaker.estimator import Estimator

xgboost_hyperparameters = {
    "max_depth": "5",
    "eta": "0.2",
    "gamma": "4",
    "min_child_weight": "6",
    "subsample": "0.7",
    "num_round": "50"
}

xgboost_image = image_uris.retrieve("xgboost", boto3.Session().region_name, version="1")

estimator = Estimator(image_uri=xgboost_image,
                      hyperparameters=xgboost_hyperparameters,
                      role=role,
                      instance_count=1,
                      instance_type='ml.m5.2xlarge',
                      output_path=output_loc,
                      volume_size=5)

from sagemaker.serializers import CSVSerializer
from sagemaker.deserializers import CSVDeserializer

train_input = sagemaker.inputs.TrainingInput(s3_data = train_loc, content_type='text/csv',s3_data_type = 'S3Prefix')
valid_input = sagemaker.inputs.TrainingInput(s3_data = validation_loc, content_type='text/csv',s3_data_type = 'S3Prefix')

estimator.CONTENT_TYPE = 'text/csv'
estimator.serializer = CSVSerializer()
estimator.deserializer = None

estimator.fit({'train':train_input, 'validation': valid_input})

# deploy model with data config
from sagemaker.model_monitor import DataCaptureConfig
from time import gmtime, strftime
s3_capture_upload_path = 's3://{}/{}/monitoring/datacapture'.format(bucket, prefix)
model_name = 'project3--model-' + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
endpoint_name = 'project3-endpoint'
data_capture_configuration = DataCaptureConfig(
    enable_capture=True,
    sampling_percentage=100,
    destination_s3_uri=s3_capture_upload_path)

deploy = estimator.deploy(initial_instance_count=1,
                          instance_type='ml.m4.xlarge',
                          data_capture_config=data_capture_configuration,
                          model_name=model_name,
                          endpoint_name=endpoint_name)

Then I get the error when using the Predictor:

from time import sleep
from sagemaker.predictor import Predictor

predictor = Predictor(endpoint_name=endpoint_name)
with open('test.csv', 'r') as f:
    for row in f:
        print(row)
        payload = row.rstrip('\n')
        response = predictor.predict(data=payload[2:])
        sleep(0.5)
print('done!')

I looked at these links but did not find an answer:

  1. https://github.com/aws-samples/reinvent2019-aim362-sagemaker-debugger-model-monitor/blob/master/02_deploy_and_monitor/deploy_and_monitor.ipynb
  2. How can I specify content_type in a training job of XGBoost from Sagemaker in Python?
  3. https://github.com/aws/amazon-sagemaker-examples/issues/729

Best answer

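First, determine which SDK version you are using. AWS made breaking changes between 1.x and 2.x. To make matters worse, the sagemaker SDK installed on notebook instances can differ by region.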
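
One way to check the installed version is to read the package metadata. The sketch below uses only the standard library; the helper name `sdk_major_version` is made up for illustration and is not part of the SageMaker SDK.

```python
from importlib.metadata import PackageNotFoundError, version


def sdk_major_version(version_string):
    """Return the major version number from a string such as '2.116.0'."""
    return int(version_string.split(".")[0])


try:
    installed = version("sagemaker")
    major = sdk_major_version(installed)
    print(f"sagemaker SDK {installed} (major version {major})")
    if major < 2:
        print("This is a 1.x install; the 2.x serializer API does not apply.")
except PackageNotFoundError:
    print("The sagemaker SDK is not installed in this environment.")
```

Running this in the same notebook kernel that creates the `Predictor` matters, since that is the environment whose SDK version decides which API applies.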

See "How to use Serializer and Deserializer in Sagemaker 2"; AWS also changed how serialization and deserialization are configured:

Behavior for serialization of input data and deserialization of result data can be configured through initializer arguments.

Please try:

from sagemaker.serializers import CSVSerializer
predictor.serializer = CSVSerializer()
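
Combined with the loop from the question, this could look like the sketch below. It passes the serializer through the `Predictor` initializer rather than assigning it afterwards, and it assumes the first column of `test.csv` is the label, which should not be sent to the model. The helpers `features_only`, `build_predictor` and `predict_file` are hypothetical names, and the SageMaker imports sit inside `build_predictor` only so that the row helper can be tried without the SDK installed.

```python
from time import sleep


def features_only(row):
    """Strip the line ending and drop the first (label) column of a CSV row."""
    _label, _, features = row.rstrip("\r\n").partition(",")
    return features


def build_predictor(endpoint_name):
    """Create a Predictor whose requests are sent with Content-Type text/csv."""
    from sagemaker.predictor import Predictor
    from sagemaker.serializers import CSVSerializer

    return Predictor(endpoint_name=endpoint_name, serializer=CSVSerializer())


def predict_file(predictor, path, pause_seconds=0.5):
    """Send each non-empty row of a CSV file to the endpoint; return the responses."""
    responses = []
    with open(path, "r") as f:
        for row in f:
            if not row.strip():
                continue  # skip blank lines
            responses.append(predictor.predict(data=features_only(row)))
            sleep(pause_seconds)
    return responses
```

A call would then be `predict_file(build_predictor(endpoint_name), 'test.csv')`. In SDK 2.x, `estimator.deploy(...)` also accepts a `serializer` argument, so the predictor it returns can be configured the same way.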

Alternatively, by setting the serializer to None, you take full control of serialization and deserialization in your own code.

predictor.serializer=None
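
Another way to control the request format yourself, independent of the SDK's serializer classes, is to call the endpoint through the low-level `sagemaker-runtime` client in boto3 and set `ContentType` explicitly. This is a sketch of that alternative, not of the `serializer=None` setting; the helper names `to_csv_payload` and `invoke_csv` are hypothetical, and the client is passed in as an argument so the functions carry no AWS dependency of their own.

```python
def to_csv_payload(values):
    """Join feature values into one CSV line, encoded as UTF-8 bytes."""
    return ",".join(str(v) for v in values).encode("utf-8")


def invoke_csv(runtime_client, endpoint_name, values):
    """Invoke the endpoint with an explicit text/csv content type.

    runtime_client is expected to be boto3.client("sagemaker-runtime").
    Returns the response body decoded as text.
    """
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Body=to_csv_payload(values),
    )
    return response["Body"].read().decode("utf-8")
```

With valid credentials, a call would be `invoke_csv(boto3.client("sagemaker-runtime"), endpoint_name, feature_values)`, where `feature_values` is one row of features without the label.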

Source: the original question on Stack Overflow, "Sagemaker: How do I set content_type in Predictor (Sagemaker > 2.0)?": https://stackoverflow.com/questions/65202873/
