
python - How do I prevent an AWS EC2 server from running indefinitely?


I have a Django application through which users submit videos to be processed by a Python script running OpenCV on a separate EC2 instance. Since this server is moderately expensive to run (p2.xlarge, roughly $3.00/hour), it is only started when a video is submitted, and I want to make sure it does not keep running if something goes wrong during processing. When the program runs correctly, the instance shuts down as intended.

The problem is that the Python script sometimes hangs (I cannot reproduce it myself, which is a separate issue), and when the script does not run to completion the server keeps running indefinitely. I have tried the solution given here for self terminating an AWS EC2 instance. That solution works while the server is idle, but it does not appear to work while the server is busy processing a video.

Is there a better way to make sure the server never runs for more than x minutes, and to stop it even if it is in the middle of processing?

The code I am currently using:

import paramiko
import boto3
import sys
from botocore.exceptions import ClientError
import json
from time import sleep

import argparse

parser = argparse.ArgumentParser()

parser.add_argument('--username', required=False)
parser.add_argument('--date', required=False)


args = parser.parse_args()

uName = args.username
theDate = args.date


# accessKey and secretKey hold the AWS credentials; they are defined elsewhere in the poster's script
ec2 = boto3.client('ec2', region_name='us-east-1', aws_access_key_id=accessKey, aws_secret_access_key=secretKey)
ec2_2 = boto3.resource('ec2', region_name='us-east-1', aws_access_key_id=accessKey, aws_secret_access_key=secretKey)
client = boto3.client('ses', region_name='us-east-1', aws_access_key_id=accessKey, aws_secret_access_key=secretKey)


s3_resource = boto3.client('s3', region_name='us-east-1', aws_access_key_id=accessKey, aws_secret_access_key=secretKey)

s3_instance = boto3.resource('s3', region_name='us-east-1', aws_access_key_id=accessKey, aws_secret_access_key=secretKey)

# instance_ids is the list of worker instance IDs; it is defined elsewhere in the poster's script
obj = s3_instance.Object('my_bucket', 'data/instances.txt')  # load the file of instances
body = obj.get()['Body'].read().decode('utf-8')

last_run = body.split()[-1]  # ID of the last instance that was run
if instance_ids.index(last_run) != 4:  # if it wasn't the 5th instance, run the next one
    instance_id = instance_ids[instance_ids.index(last_run) + 1]
else:
    instance_id = instance_ids[0]  # if it was the last instance, wrap around to the first

body += '\n' + instance_id  # append the instance being run to the end of the file
obj.put(Body=body)  # write the file back to S3

while True:
    try:
        # dry run first to verify that we have permission to start the instance
        ec2.start_instances(InstanceIds=[instance_id], DryRun=True)
    except ClientError as e:
        if 'DryRunOperation' not in str(e):
            raise
    try:
        ec2.start_instances(InstanceIds=[instance_id], DryRun=False)
        break
    except ClientError:
        continue

print('instance started')

# wait until the instance reports the "running" state (code 16)
while True:
    if ec2_2.Instance(instance_id).state['Code'] != 16:
        print(ec2_2.Instance(instance_id).state)
        sleep(2.5)
        continue
    else:
        print('state == running')
        break

# fetch the instance's public IP address once it becomes available
while True:
    try:
        ip_add = ec2_2.Instance(instance_id).public_ip_address
        break
    except Exception:
        continue

prevent_bankruptcy = 'echo "sudo halt" | at now + 15 minutes'

# `file` (the uploaded video's file name) is defined elsewhere in the poster's script
move_frome_s3 = 'aws s3 cp s3://my-bucket/media/{0}/Sessions/{1}/Uploads/{2} ./python-scripts/data/'.format(uName, theDate, file)

move_about_file = 'aws s3 cp s3://my-bucket/media/{}/about.txt ./python-scripts/data/results/result-dicts/'.format(uName)

move_assessment_file = 'aws s3 cp s3://my-bucket/media/{}/ranges.txt ./python-scripts/data/results/result-dicts/'.format(uName)

convert_file= 'cd python-scripts && python3 convert_codec.py --username {0} --date {1}'.format(uName, theDate)

key_location = "/my/key/folder/MyKey.pem"

k = paramiko.RSAKey.from_private_key_file(key_location)
c = paramiko.SSHClient()
c.set_missing_host_key_policy(paramiko.AutoAddPolicy())


# retry SSH until the instance is ready to accept connections
while True:
    try:
        c.connect(hostname=ip_add, username="ubuntu", pkey=k, banner_timeout=60)
        break
    except Exception:
        sleep(1.5)


# make_dir, create_folder and errList are defined elsewhere in the poster's script
commands = [prevent_bankruptcy, make_dir, move_frome_s3, move_about_file, convert_file, move_assessment_file, create_folder]


for command in commands:
    print("Executing {}".format(command))
    stdin, stdout, stderr = c.exec_command(command)
    err = stderr.read()  # read stderr once; a second read() would return nothing
    errList.append(err)
    print(stdout.read())
    print("Errors")
    print("***", err)
c.close()

try:
    # dry run first to verify that we have permission to stop the instance
    ec2.stop_instances(InstanceIds=[instance_id], DryRun=True)
except ClientError as e:
    if 'DryRunOperation' not in str(e):
        raise

try:
    ec2.stop_instances(InstanceIds=[instance_id], DryRun=False)
except ClientError as e:
    print(e)

If I edit the list of commands so that only prevent_bankruptcy runs (which issues echo "sudo halt" | at now + 15 minutes) and leave the server idle for 15 minutes, it shuts down on its own. If something goes wrong in convert_file, however, the server keeps running indefinitely, which can lead to an unpleasant surprise on the bill.
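
A variant of the prevent_bankruptcy idea, shown here only as a sketch (it is not part of the question's code), is to use shutdown's built-in timer instead of at: issue an unconditional power-off as the very first SSH command, and cancel it at the end if everything completed. The MAX_MINUTES value and the cancel step are assumptions for illustration.

MAX_MINUTES = 30  # assumed hard limit on how long the instance may run
hard_deadline = 'sudo shutdown -P +{}'.format(MAX_MINUTES)  # OS powers off after MAX_MINUTES
cancel_deadline = 'sudo shutdown -c'  # cancel the pending power-off once the work has finished

stdin, stdout, stderr = c.exec_command(hard_deadline)
# ... run the real processing commands here ...
stdin, stdout, stderr = c.exec_command(cancel_deadline)

Because the power-off is scheduled by the operating system itself, it fires even if convert_file hangs or the controlling script loses its SSH session.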

Best answer

You can build a timeout into the Python script using signals, and stop the instance from the external script you posted above.

import signal

def handler(signum, frame):
    # raised when the alarm fires
    raise TimeoutError('Timeout')

def loop():
    for command in commands:
        c.exec_command(command)

signal.signal(signal.SIGALRM, handler)
signal.alarm(60)  # allow the commands 60 seconds before the alarm fires

try:
    loop()
except TimeoutError:
    ec2.stop_instances(InstanceIds=[instance_id])

signal.alarm(0)  # switch the alarm off once the work has finished

The timeout code is taken from the top answer to Timeout on a function call.

There is also a caveat from the comments on that same answer: it only works in the main Python thread, you have to switch the alarm off again with signal.alarm(0) (which I have included), and it does not work with C extensions.
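
If the signal-based timeout is not an option (for example when the commands are not issued from the main thread), a rough alternative is to enforce the deadline from the controlling script itself: poll the SSH channel for each command's exit status and stop the instance once a wall-clock deadline has passed. This is only a sketch; it assumes the same c, commands, ec2 and instance_id objects as in the question, and TIMEOUT_SECONDS is an assumed placeholder.

import time

TIMEOUT_SECONDS = 15 * 60  # assumed overall limit for all commands
deadline = time.monotonic() + TIMEOUT_SECONDS

for command in commands:
    stdin, stdout, stderr = c.exec_command(command)
    # wait for this command to finish, but never wait past the overall deadline
    while not stdout.channel.exit_status_ready():
        if time.monotonic() > deadline:
            print('Deadline exceeded, stopping the instance')
            ec2.stop_instances(InstanceIds=[instance_id])
            raise SystemExit(1)
        time.sleep(5)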

As for python - How do I prevent an AWS EC2 server from running indefinitely?, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/63515472/
