
google-app-engine - App Engine instances die instantly, locking up deferred tasks until they hit the 10-minute timeout

Reposted. Author: 行者123. Updated: 2023-12-04 17:50:30

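I have an endpoint that receives a batch of 10k-20k records. It returns a job ID and launches deferred tasks to process the records in parallel. Sometimes a new instance appears to grab some of the tasks without actually processing them. The instance seems to die immediately.

Eventually those tasks hit the 10-minute timeout and are started again.

If I find one of these tasks and filter by the ID of the instance that ran it, this is what I see in the Google Logs Viewer:

[screenshot: logs]

Most of the log entries contain only the message "Process terminated because the request deadline was exceeded during a loading request." The timestamp of the message is 10 minutes after the timestamp of the request.

One has this stack trace: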

Traceback (most recent call last):
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/runtime/wsgi.py", line 240, in Handle
    handler = _config_handle.add_wsgi_middleware(self._LoadHandler())
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/api/lib_config.py", line 351, in __getattr__
    self._update_configs()
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/api/lib_config.py", line 283, in _update_configs
    self._lock.acquire()
  File "/base/data/home/runtimes/python27/python27_dist/lib/python2.7/threading.py", line 170, in acquire
    self.__count = self.__count + 1
DeadlineExceededError: The overall deadline for responding to the HTTP request was exceeded.

Another has this:

(/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/runtime/wsgi.py:252)
Traceback (most recent call last):
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/runtime/wsgi.py", line 240, in Handle
    handler = _config_handle.add_wsgi_middleware(self._LoadHandler())
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/api/lib_config.py", line 351, in __getattr__
    self._update_configs()
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/api/lib_config.py", line 287, in _update_configs
    self._registry.initialize()
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/api/lib_config.py", line 160, in initialize
    import_func(self._modname)
  File "/base/data/home/apps/s~myappid/dev.403063962077465992/appengine_config.py", line 12, in <module>
    vendor.add('lib')
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/ext/vendor/__init__.py", line 40, in add
    elif os.path.isdir(path):
  File "/base/data/home/runtimes/python27/python27_dist/lib/python2.7/genericpath.py", line 52, in isdir
    return stat.S_ISDIR(st.st_mode)
DeadlineExceededError: The overall deadline for responding to the HTTP request was exceeded.

And another has this:

(/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/runtime/wsgi.py:252)
Traceback (most recent call last):
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/runtime/wsgi.py", line 240, in Handle
    handler = _config_handle.add_wsgi_middleware(self._LoadHandler())
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/api/lib_config.py", line 351, in __getattr__
    self._update_configs()
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/api/lib_config.py", line 287, in _update_configs
    self._registry.initialize()
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/api/lib_config.py", line 160, in initialize
    import_func(self._modname)
  File "/base/data/home/apps/s~myappid/dev.403063962077465992/appengine_config.py", line 14, in <module>
    from lib import requests
  File "/base/data/home/apps/s~myappid/dev.403063962077465992/lib/requests/__init__.py", line 52, in <module>
    from .packages.urllib3.contrib import pyopenssl
  File "/base/data/home/apps/s~myappid/dev.403063962077465992/lib/requests/packages/__init__.py", line 27, in <module>
    from . import urllib3
  File "/base/data/home/apps/s~myappid/dev.403063962077465992/lib/requests/packages/urllib3/__init__.py", line 8, in <module>
    from .connectionpool import (
  File "/base/data/home/apps/s~myappid/dev.403063962077465992/lib/requests/packages/urllib3/connectionpool.py", line 29, in <module>
    from .connection import (
  File "/base/data/home/apps/s~myappid/dev.403063962077465992/lib/requests/packages/urllib3/connection.py", line 39, in <module>
    from .util.ssl_ import (
  File "/base/data/home/apps/s~myappid/dev.403063962077465992/lib/requests/packages/urllib3/util/__init__.py", line 3, in <module>
    from .connection import is_connection_dropped
  File "/base/data/home/apps/s~myappid/dev.403063962077465992/lib/requests/packages/urllib3/util/connection.py", line 145, in <module>
    HAS_IPV6 = _has_ipv6('::1')
  File "/base/data/home/apps/s~myappid/dev.403063962077465992/lib/requests/packages/urllib3/util/connection.py", line 135, in _has_ipv6
    sock.bind((host, 0))
  File "/base/data/home/runtimes/python27/python27_dist/lib/python2.7/socket.py", line 227, in meth
    return getattr(self._sock,name)(*args)
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/api/remote_socket/_remote_socket.py", line 663, in bind
    self._CreateSocket(bind_address=address)
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/api/remote_socket/_remote_socket.py", line 609, in _CreateSocket
    'remote_socket', 'CreateSocket', request, reply)
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/api/apiproxy_stub_map.py", line 95, in MakeSyncCall
    return stubmap.MakeSyncCall(service, call, request, response)
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/api/apiproxy_stub_map.py", line 329, in MakeSyncCall
    rpc.CheckSuccess()
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/api/apiproxy_rpc.py", line 133, in CheckSuccess
    elif self.exception:
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/api/apiproxy_rpc.py", line 136, in exception
    @property
DeadlineExceededError: The overall deadline for responding to the HTTP request was exceeded.
This request caused a new process to be started for your application, and thus caused your application code to be loaded for the first time. This request may thus take longer and use more CPU than a typical request for your application.

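The main issue is that I need the batch to finish within 5-10 minutes.

Each record in the batch should take at most about a minute to process, so one solution would be to change the 10-minute timeout, but Google support said that is not possible.

I tried implementing warmup requests to work around the loading requests, but that did not seem to have any effect.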

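Best answer

My previous answer worked for a glorious 24 hours before the problem came back. It felt like the problem occurred less often after upgrading python requests from 2.12.3 to 2.18.2, but it is hard to say.

Regardless, the solution that finally worked was to edit the source code of urllib3 (it has been 5 days now and the problem has not happened again).

In the file urllib3/util/connection.py, hard-code _has_ipv6(host) to always return False (which is what it always returns on App Engine anyway):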

 def _has_ipv6(host):
     """ Returns True if the system can bind an IPv6 address. """
+    return False
     sock = None
     has_ipv6 = False

     if socket.has_ipv6:
         # has_ipv6 returns true if cPython was compiled with IPv6 support.
         # It does not tell us if the system has IPv6 support enabled. To
         # determine that we must bind to an IPv6 address.
         # https://github.com/shazow/urllib3/pull/611
         # https://bugs.python.org/issue658327
         try:
             sock = socket.socket(socket.AF_INET6)
             sock.bind((host, 0))
             has_ipv6 = True
         except Exception:
             pass

     if sock:
         sock.close()
     return has_ipv6


 HAS_IPV6 = _has_ipv6('::1')

I believe the underlying problem is that the call to sock.bind((host, 0)) hangs. According to the Google documentation at https://cloud.google.com/appengine/docs/standard/python/sockets/

You cannot bind to specific IP addresses or ports.
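
For readers who would rather not edit the vendored urllib3 source, one possible alternative is sketched below. It is an assumption, not something tested in the original answer: because the probe only reaches sock.bind() when socket.has_ipv6 is true, forcing that flag to False before the probe runs should skip the bind entirely. The function _has_ipv6 is a copy of the urllib3 1.22 code shown above, and probe_without_bind is a hypothetical helper name used only for illustration.

```python
import socket


def _has_ipv6(host):
    """Copy of the urllib3 1.22 probe, reproduced here for illustration."""
    sock = None
    has_ipv6 = False
    if socket.has_ipv6:
        try:
            sock = socket.socket(socket.AF_INET6)
            sock.bind((host, 0))  # the call that appears to hang on App Engine
            has_ipv6 = True
        except Exception:
            pass
    if sock:
        sock.close()
    return has_ipv6


def probe_without_bind(host):
    """Hypothetical helper: run the probe with socket.has_ipv6 forced off.

    With the flag set to False the probe never creates or binds a socket.
    The original value is restored afterwards.
    """
    original = socket.has_ipv6
    socket.has_ipv6 = False
    try:
        return _has_ipv6(host)
    finally:
        socket.has_ipv6 = original


if __name__ == "__main__":
    print(probe_without_bind("::1"))
```

In appengine_config.py this would amount to setting socket.has_ipv6 = False before the first import of requests. Whether that is safe depends on what else in the application reads socket.has_ipv6, so treat it as an experiment rather than a verified fix.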

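I was able to create a self-contained GAE sample project that reproduces the problem. Install these third-party libraries into a folder named lib with pip install requests requests_toolbelt -t lib, then create these files:

app.yaml: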

application: #your project id here
version: dev
runtime: python27
api_version: 1
threadsafe: true
inbound_services:
- warmup

automatic_scaling:
  min_idle_instances: 0
  max_concurrent_requests: 8 # default value

env_variables:
  GAE_USE_SOCKETS_HTTPLIB : 'true'

builtins:
- appstats: on #/_ah/stats/
- remote_api: on #/_ah/remote_api/
- deferred: on

handlers:
- url: /.*
  script: main.app

libraries:
- name: jinja2
  version: "2.6"
- name: webapp2
  version: "2.5.2"
- name: markupsafe
  version: "0.15"
- name: ssl
  version: "2.7.11"
- name: pycrypto
  version: "2.6"
- name: lxml
  version: latest

main.py:

import webapp2
import requests
import time

from google.appengine.api.taskqueue import taskqueue
from google.appengine.api import app_identity
from google.appengine.ext import deferred, ndb


class MainHandler(webapp2.RequestHandler):
    def get(self):
        self.response.write('''<form method="POST"><input type="submit" value="Launch"></form>''')

    def post(self):
        queue = taskqueue.Queue()
        futures = [queue.add_async(taskqueue.Task(url="/task")) for _ in xrange(0, 2000)]
        ndb.Future.wait_all(futures)
        print 'launched'
        self.get()


class TaskHandler(webapp2.RequestHandler):
    def post(self):
        try:
            r = requests.post("https://"+app_identity.get_application_id()+".appspot.com/post")
            print r.text
        except Exception as e:
            print str(e)


class RequestFromTaskHandler(webapp2.RequestHandler):
    def post(self):
        time.sleep(2)
        self.response.write('responded')


app = webapp2.WSGIApplication([
    ('/', MainHandler),
    ('/_ah/warmup', MainHandler),
    ('/task', TaskHandler),
    ('/post', RequestFromTaskHandler),
], debug=True)

appengine_config.py:

import os

from google.appengine.ext import vendor

vendor.add('lib')

import requests
from requests_toolbelt.adapters import appengine as requests_toolbelt_appengine

# Use the App Engine Requests adapter. This makes sure that Requests uses
# URLFetch.
requests_toolbelt_appengine.monkeypatch()

queue.yaml:

queue:
- name: default
  rate: 100/s
  bucket_size: 500
  max_concurrent_requests: 1000

Essentially, a POST to MainHandler launches 2000 deferred tasks, each of which tries to make an outbound request (to the app itself).
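
As a rough sanity check on how fast those 2000 tasks leave the queue, the queue.yaml settings can be plugged into a simplified token-bucket model. This is only a sketch under stated assumptions: the bucket starts full, tokens refill at exactly the configured rate, and max_concurrent_requests, instance start-up, and task latency are all ignored. The function name seconds_to_dispatch is hypothetical.

```python
def seconds_to_dispatch(task_count, rate_per_second, bucket_size):
    """Estimate when the last task is dispatched under a simple token bucket.

    Assumes a full bucket at time zero: up to bucket_size tasks go out
    immediately, and the remainder is released at rate_per_second.
    """
    remaining = max(0, task_count - bucket_size)
    return remaining / float(rate_per_second)


if __name__ == "__main__":
    # Values taken from the queue.yaml above: rate 100/s, bucket_size 500.
    print(seconds_to_dispatch(2000, 100, 500))
```

Under this model the queue hands out every task within seconds, which is consistent with the observation that a task stuck for 10 minutes is waiting on the instance that picked it up, not on the queue's rate limits.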

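To monitor the batch, open https://console.cloud.google.com/appengine/taskqueues in the Cloud Console. You should see most of the tasks finish quickly, except for a few that keep showing as running for 10 minutes.

Once they finish, they will show up in the Logs Viewer.

For some reason, the problem only happens on the first batch of 2000 unless I disable and then re-enable the project.

Once you are able to reproduce the problem, make the change to urllib3 and the problem should no longer occur.

This is with urllib3 version 1.22.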

Source: the corresponding question on Stack Overflow, https://stackoverflow.com/questions/45425685/
