I am connecting to a local server (OSRM) via HTTP to submit routes and get back drive times. I notice that I/O is slower than threading, because it seems the wait for the computation is smaller than the time it takes to send the request and process the JSON output (I believe I/O is better when the server takes some time to process your request -> you don't want it to be blocking because you have to wait; this isn't my case). Threading suffers from the Global Interpreter Lock, so it appears (and the evidence below suggests) that my fastest option is multiprocessing.

The problem with multiprocessing is that it is so fast it exhausts my sockets and I get an error (each request issues a new connection). I can (serially) use a requests.Sessions() object to keep connections alive, however I cannot get this to work in parallel (each process has its own session).
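For reference, this is the kind of serial keep-alive code I mean; a minimal sketch, assuming url_routes is my list of (url, query-id) pairs:

import requests

session = requests.Session()
for req_url, qid in url_routes:
    # The Session reuses one TCP connection, so no socket exhaustion,
    # but the requests run strictly one at a time.
    json_geocode = session.get(req_url).json()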
The closest I currently have to working code is the multiprocessing version below:
import json
from multiprocessing import Pool, cpu_count
from urllib3 import HTTPConnectionPool

conn_pool = HTTPConnectionPool(host='127.0.0.1', port=5005, maxsize=cpu_count())

def ReqOsrm(url_input):
    ul, qid = url_input
    try:
        response = conn_pool.request('GET', ul)
        json_geocode = json.loads(response.data.decode('utf-8'))
        status = int(json_geocode['status'])
        if status == 200:
            tot_time_s = json_geocode['route_summary']['total_time']
            tot_dist_m = json_geocode['route_summary']['total_distance']
            used_from, used_to = json_geocode['via_points']
            out = [qid, status, tot_time_s, tot_dist_m,
                   used_from[0], used_from[1], used_to[0], used_to[1]]
            return out
        else:
            print("Done but no route: %d %s" % (qid, ul))
            return [qid, 999, 0, 0, 0, 0, 0, 0]
    except Exception as err:
        print("%s: %d %s" % (err, qid, ul))
        return [qid, 999, 0, 0, 0, 0, 0, 0]

# run:
pool = Pool(cpu_count())
calc_routes = pool.map(ReqOsrm, url_routes)
pool.close()
pool.join()
This runs until it exhausts the available sockets, then every call fails with:

HTTPConnectionPool(host='127.0.0.1', port=5005): Max retries exceeded with url: /viaroute?loc=44.779708,4.2609877&loc=44.648439,4.2811959&alt=false&geometry=false (Caused by NewConnectionError(': Failed to establish a new connection: [WinError 10048] Only one usage of each socket address (protocol/network address/port) is normally permitted',))
Plain requests (a new connection per call) hits the same wall:

def ReqOsrm(url_input):
    req_url, query_id = url_input
    try_c = 0
    #print(req_url)
    while try_c < 5:
        try:
            response = requests.get(req_url)
            json_geocode = response.json()
            status = int(json_geocode['status'])
            # Found route between points
            if status == 200:
                ....

pool = Pool(cpu_count() - 1)
calc_routes = pool.map(ReqOsrm, url_routes)
HTTPConnectionPool(host='127.0.0.1', port=5000): Max retries exceeded with url: /viaroute?loc=49.34343,3.30199&loc=49.56655,3.25837&alt=false&geometry=false (Caused by NewConnectionError(': Failed to establish a new connection: [WinError 10048] Only one usage of each socket address (protocol/network address/port) is normally permitted',))
From the Requests documentation:

Blocking Or Non-Blocking?
With the default Transport Adapter in place, Requests does not provide any kind of non-blocking IO. The Response.content property will block until the entire response has been downloaded. If you require more granularity, the streaming features of the library (see Streaming Requests) allow you to retrieve smaller quantities of the response at a time. However, these calls will still block.
If you are concerned about the use of blocking IO, there are lots of projects out there that combine Requests with one of Python’s asynchronicity frameworks.
Two excellent examples are grequests and requests-futures.
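For what it's worth, grequests usage would look something like this minimal sketch (the size argument caps concurrency; untested against OSRM):

import grequests

reqs = (grequests.get(url) for url, qid in url_routes)
# Issue up to 100 requests at a time on gevent greenlets
responses = grequests.map(reqs, size=100)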
Here is my requests-futures attempt:

from requests_futures.sessions import FuturesSession
from concurrent.futures import ThreadPoolExecutor, as_completed

calc_routes = []
futures = {}
with FuturesSession(executor=ThreadPoolExecutor(max_workers=1000)) as session:
    # Submit requests and process in background
    for i in range(len(url_routes)):
        url_in, qid = url_routes[i]  # url | query-id
        future = session.get(url_in, background_callback=lambda sess, resp: ReqOsrm(sess, resp))
        futures[future] = qid
        # Process the futures as they become complete
        for future in as_completed(futures):
            r = future.result()
            try:
                row = [futures[future]] + r.data
            except Exception as err:
                print('No route')
                row = [futures[future], 999, 0, 0, 0, 0, 0, 0]
            calc_routes.append(row)
def ReqOsrm(sess, resp):
    json_geocode = resp.json()
    status = int(json_geocode['status'])
    # Found route between points
    if status == 200:
        tot_time_s = json_geocode['route_summary']['total_time']
        tot_dist_m = json_geocode['route_summary']['total_distance']
        used_from = json_geocode['via_points'][0]
        used_to = json_geocode['via_points'][1]
        out = [status, tot_time_s, tot_dist_m,
               used_from[0], used_from[1], used_to[0], used_to[1]]
    # Cannot find route between points (code errors as 999)
    else:
        out = [999, 0, 0, 0, 0, 0, 0]
    resp.data = out
I also tried a classic thread pool with queues:

import requests
from queue import Queue
from threading import Thread

def doWork():
    while True:
        url, qid = q.get()
        status, resp = getReq(url)
        processReq(status, resp, qid)
        q.task_done()

def getReq(url):
    try:
        resp = requests.get(url)
        return resp.status_code, resp
    except:
        return 999, None

def processReq(status, resp, qid):
    try:
        json_geocode = resp.json()
        # Found route between points
        if status == 200:
            tot_time_s = json_geocode['route_summary']['total_time']
            tot_dist_m = json_geocode['route_summary']['total_distance']
            used_from = json_geocode['via_points'][0]
            used_to = json_geocode['via_points'][1]
            out = [qid, status, tot_time_s, tot_dist_m,
                   used_from[0], used_from[1], used_to[0], used_to[1]]
        else:
            print("Done but no route")
            out = [qid, 999, 0, 0, 0, 0, 0, 0]
    except Exception as err:
        print("Error: %s" % err)
        out = [qid, 999, 0, 0, 0, 0, 0, 0]
    qres.put(out)
    return

# Run:
concurrent = 1000
qres = Queue()
q = Queue(concurrent)
for i in range(concurrent):
    t = Thread(target=doWork)
    t.daemon = True
    t.start()
try:
    for url in url_routes:
        q.put(url)
    q.join()
except Exception:
    pass
# Get results
calc_routes = [qres.get() for _ in range(len(url_routes))]
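A variation I could also try is concurrent.futures with one shared Session instead of the hand-rolled Thread/Queue pool. This is just a sketch, assuming a single Session can safely be shared across threads here (the output row is trimmed to the first few fields for brevity):

import requests
from concurrent.futures import ThreadPoolExecutor

session = requests.Session()

def fetch(url_input):
    url, qid = url_input
    try:
        json_geocode = session.get(url).json()
        if int(json_geocode['status']) == 200:
            rs = json_geocode['route_summary']
            return [qid, 200, rs['total_time'], rs['total_distance']]
    except Exception as err:
        print("Error: %s" % err)
    return [qid, 999, 0, 0, 0, 0, 0, 0]

with ThreadPoolExecutor(max_workers=100) as executor:
    calc_routes = list(executor.map(fetch, url_routes))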
ERROR:tornado.application:Multiple exceptions in yield list
Traceback (most recent call last):
  File "C:\Anaconda3\lib\site-packages\tornado\gen.py", line 789, in callback
    result_list.append(f.result())
  File "C:\Anaconda3\lib\site-packages\tornado\concurrent.py", line 232, in result
    raise_exc_info(self._exc_info)
  File "", line 3, in raise_exc_info
tornado.httpclient.HTTPError: HTTP 599: Timeout
from functools import partial
from tornado import gen, ioloop
from tornado.escape import json_decode
from tornado.httpclient import AsyncHTTPClient

def handle_req(r):
    try:
        json_geocode = json_decode(r)
        status = int(json_geocode['status'])
        tot_time_s = json_geocode['route_summary']['total_time']
        tot_dist_m = json_geocode['route_summary']['total_distance']
        used_from = json_geocode['via_points'][0]
        used_to = json_geocode['via_points'][1]
        out = [status, tot_time_s, tot_dist_m,
               used_from[0], used_from[1], used_to[0], used_to[1]]
        print(out)
    except Exception as err:
        print(err)
        out = [999, 0, 0, 0, 0, 0, 0]
    return out

# Configure
# For some reason curl_httpclient crashes my computer
AsyncHTTPClient.configure("tornado.simple_httpclient.SimpleAsyncHTTPClient", max_clients=10)

@gen.coroutine
def run_experiment(urls):
    http_client = AsyncHTTPClient()
    responses = yield [http_client.fetch(url) for url, qid in urls]
    responses_out = [handle_req(r.body) for r in responses]
    raise gen.Return(value=responses_out)

# Initialise
_ioloop = ioloop.IOLoop.instance()
run_func = partial(run_experiment, url_routes)
calc_routes = _ioloop.run_sync(run_func)
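The HTTP 599 above may simply be Tornado's per-request timeout (20 seconds by default, as far as I can tell); one thing to try is raising it on each fetch, e.g.:

    # request_timeout is passed through to the underlying HTTPRequest
    responses = yield [http_client.fetch(url, request_timeout=600) for url, qid in urls]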
import asyncio
import json
import aiohttp

def handle_req(data, qid):
    json_geocode = json.loads(data.decode('utf-8'))
    status = int(json_geocode['status'])
    if status == 200:
        tot_time_s = json_geocode['route_summary']['total_time']
        tot_dist_m = json_geocode['route_summary']['total_distance']
        used_from = json_geocode['via_points'][0]
        used_to = json_geocode['via_points'][1]
        out = [qid, status, tot_time_s, tot_dist_m,
               used_from[0], used_from[1], used_to[0], used_to[1]]
    else:
        print("Done, but no route for {0} - status: {1}".format(qid, status))
        out = [qid, 999, 0, 0, 0, 0, 0, 0]
    return out

def chunked_http_client(num_chunks):
    # Use a semaphore to limit the number of concurrent requests
    semaphore = asyncio.Semaphore(num_chunks)

    # Return a coroutine that downloads files asynchronously while
    # respecting the locking of the semaphore
    @asyncio.coroutine
    def http_get(url, qid):
        nonlocal semaphore
        with (yield from semaphore):
            response = yield from aiohttp.request('GET', url)
            body = yield from response.content.read()
            yield from response.wait_for_close()
        return body, qid

    return http_get

@asyncio.coroutine
def run_experiment(urls):
    http_client = chunked_http_client(500)
    # http_client returns futures; save them all to a list
    tasks = [http_client(url, qid) for url, qid in urls]
    response = []
    # Wait for futures to become ready, then iterate over them
    for future in asyncio.as_completed(tasks):
        data, qid = yield from future
        try:
            out = handle_req(data, qid)
        except Exception as err:
            print("Error for {0} - {1}".format(qid, err))
            out = [qid, 999, 0, 0, 0, 0, 0, 0]
        response.append(out)
    return response

# Run:
loop = asyncio.get_event_loop()
calc_routes = loop.run_until_complete(run_experiment(url_routes))
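For reference, the same experiment with a shared aiohttp.ClientSession (which, like requests.Session, keeps connections alive) would look roughly like this sketch in Python 3.5+ async/await syntax (untested):

import asyncio
import aiohttp

async def fetch(session, semaphore, url, qid):
    # The semaphore caps in-flight requests; the shared session reuses connections
    async with semaphore:
        async with session.get(url) as response:
            data = await response.read()
    return handle_req(data, qid)  # handle_req as defined above

async def run_experiment_session(urls):
    semaphore = asyncio.Semaphore(500)
    async with aiohttp.ClientSession() as session:
        tasks = [fetch(session, semaphore, url, qid) for url, qid in urls]
        return await asyncio.gather(*tasks)

calc_routes = asyncio.get_event_loop().run_until_complete(run_experiment_session(url_routes))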
Best Answer
Looking at your multiprocessing code at the top of the question: it seems HTTPConnectionPool() is being called each time ReqOsrm runs, so a new pool is created for every URL. Instead, use the initializer and initargs parameters of Pool to create a single pool per process:
conn_pool = None

def makePool(host, port):
    global conn_pool
    conn_pool = HTTPConnectionPool(host=host, port=port, maxsize=1)

def ReqOsrm(url_input):
    ul, qid = url_input
    try:
        response = conn_pool.request('GET', ul)
        json_geocode = json.loads(response.data.decode('utf-8'))
        status = int(json_geocode['status'])
        if status == 200:
            tot_time_s = json_geocode['route_summary']['total_time']
            tot_dist_m = json_geocode['route_summary']['total_distance']
            used_from, used_to = json_geocode['via_points']
            out = [qid, status, tot_time_s, tot_dist_m,
                   used_from[0], used_from[1], used_to[0], used_to[1]]
            return out
        else:
            print("Done but no route: %d %s" % (qid, ul))
            return [qid, 999, 0, 0, 0, 0, 0, 0]
    except Exception as err:
        print("%s: %d %s" % (err, qid, ul))
        return [qid, 999, 0, 0, 0, 0, 0, 0]

if __name__ == "__main__":
    # run:
    pool = Pool(initializer=makePool, initargs=('127.0.0.1', 5005))
    calc_routes = pool.map(ReqOsrm, url_routes)
    pool.close()
    pool.join()
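The same initializer trick also gives you what you wanted from requests.Sessions(): one keep-alive Session per worker process. A sketch (output row trimmed for brevity):

import requests
from multiprocessing import Pool, cpu_count

session = None

def makeSession():
    # Runs once in each worker process; the Session keeps its connection alive
    global session
    session = requests.Session()

def ReqOsrmSession(url_input):
    ul, qid = url_input
    try:
        json_geocode = session.get(ul).json()
        if int(json_geocode['status']) == 200:
            rs = json_geocode['route_summary']
            return [qid, 200, rs['total_time'], rs['total_distance']]
    except Exception as err:
        print("%s: %s" % (err, ul))
    return [qid, 999, 0, 0, 0, 0, 0, 0]

if __name__ == "__main__":
    pool = Pool(cpu_count(), initializer=makeSession)
    calc_routes = pool.map(ReqOsrmSession, url_routes)
    pool.close()
    pool.join()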
In your requests-futures code, for future in as_completed(futures): was indented under the outer loop for i in range(len(url_routes)):. So a request is issued in the outer loop, and then the inner loop blocks waiting for that future to return before the next iteration of the outer loop. That makes the requests run serially rather than in parallel. The corrected code:
calc_routes = []
futures = {}
with FuturesSession(executor=ThreadPoolExecutor(max_workers=1000)) as session:
    # Submit all the requests and process in background
    for i in range(len(url_routes)):
        url_in, qid = url_routes[i]  # url | query-id
        future = session.get(url_in, background_callback=lambda sess, resp: ReqOsrm(sess, resp))
        futures[future] = qid

    # This was indented under the inner loop in section B of the question.
    # Process the futures as they become complete.
    for future in as_completed(futures):
        r = future.result()
        try:
            row = [futures[future]] + r.data
        except Exception as err:
            print('No route')
            row = [futures[future], 999, 0, 0, 0, 0, 0, 0]
        print(row)
        calc_routes.append(row)
A similar question about python - Python requests - threads/processes versus IO can be found on Stack Overflow: https://stackoverflow.com/questions/35747235/