
python - Replacing nested for loops combined with conditions to boost performance

Reposted. Author: 行者123. Updated: 2023-12-05 04:20:25

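To speed up my code, I would like to replace my for loops with vectorization or another recommended tool. I found plenty of examples for replacing simple for loops, but none for replacing nested for loops combined with conditions that I could understand or that would have helped me...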

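With my code I want to check whether points (X, Y coordinates) can be connected by lineaments (linear structures). I started out very simple, but over time the code outgrew itself and is now really slow... Here is a working example of the part that takes the most time: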

import numpy as np
import matplotlib.pyplot as plt
from shapely.geometry import MultiLineString, LineString, Point
from shapely.affinity import rotate
from math import sqrt
from tqdm import tqdm
import random as rng

# creating random array of points
xys = rng.sample(range(201 * 201), 100)
points = [list(divmod(xy, 201)) for xy in xys]

# plot points
plt.scatter(*zip(*points))

# calculate length for rotating lines -> diagonal of bounds so all points able to be reached
length = sqrt(2)*200

# calculate angles to rotate lines
angles = []
for a in range(0, 360, 1):
    angle = np.deg2rad(a)
    angles.append(angle)

# copy points array to helper array (points_list) so original array is not manipulated
points_list = points.copy()

# array to save final lines
lines = []

# iterate over every point in points array to search for connecting lines
for point in tqdm(points):
    # delete point from helper array to speed up iteration -> so points do not get
    # double, triple, ... checked
    if len(points_list) > 0:
        points_list.remove(point)
    else:
        break
    # create line from original point to point at end of line (x+length) - this line
    # gets rotated at calculated angles
    start = Point(point)
    end = Point(start.x+length, start.y)
    line = LineString([start,end])
    # iterate over angle array to rotate line by each angle
    for angle in angles:
        rot_line = rotate(line, angle, origin=start, use_radians=True)
        lst = list(rot_line.coords)
        # save starting point (a) and ending point (b) of rotated line for np.cross()
        # (cross product to check if points on/near rotated line)
        a = np.asarray(lst[0])
        b = np.asarray(lst[1])
        # counter to count number of points on/near line
        count = 0
        line_list = []
        # iterate manipulated points_list array (only points left for which there has
        # not been a line rotated yet)
        for poi in points_list:
            # check whether point (poi) is on/near rotated line by calculating cross
            # product (np.cross())
            p = np.asarray(poi)
            cross = np.cross(p-a,b-a)
            # check if poi is inside accepted deviation from cross product
            if cross > -750 and cross < 750:
                # check if more than 5 points (poi) are on/near the rotated line
                if count < 5:
                    line_list.append(poi)
                    count += 1
                # if 5 points are connected by the rotated line sort the coordinates
                # of the points and check if the length of the line meets the criteria
                else:
                    line_list = sorted(line_list , key=lambda k: [k[1], k[0]])
                    line_length = LineString(line_list)
                    if line_length.length >= 10 and line_length.length <= 150:
                        lines.append(line_list)
                    break

# use shapely's MultiLineString to create lines from coordinates and plot them
# afterwards
multiLines = MultiLineString(lines)

fig, ax = plt.subplots()
ax.set_title("Lines")
for multiLine in multiLines.geoms:
    # print(multiLine)
    plt.plot(*multiLine.xy)

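As mentioned above, I was thinking about using pandas or numpy vectorization, so I built a pandas df for the points and lines (gdf) and another one with the different angles (angles) to rotate the lines: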

Name    Type       Size         Value
gdf     DataFrame  (122689, 6)  Column names: x, y, value, start, end, line
angles  DataFrame  (360, 1)     Column names: angle

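But I could not come up with a way to replace this nested for loop and its conditions with pandas vectorization. I found this article on Medium, which mentions conditions for vectorization about halfway through, and I wonder whether my code is simply not suited to vectorization because of the dependencies inside the loops...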

If that is the case, it does not have to be vectorization; anything that improves performance is welcome!

Best answer

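You can quite easily vectorize the most computationally intensive part: the innermost loop. The idea is to process the whole points_list at once. np.cross can be applied to every row, and np.where can be used to filter the result (and get the IDs).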
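To make the idea concrete, here is a minimal sketch on a handful of made-up points. The helper name cross2d and the sample data are assumptions for illustration only; the helper computes the same scalar that np.cross returns for 2-D vectors, written out explicitly because recent NumPy versions warn that 2-D input to np.cross is deprecated.

```python
import numpy as np

def cross2d(u, v):
    """z-component of the 2-D cross product; u and v broadcast against each other."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return u[..., 0] * v[..., 1] - u[..., 1] * v[..., 0]

# Hypothetical sample data: a horizontal line from a to b and four candidate points.
a = np.array([0.0, 0.0])
b = np.array([10.0, 0.0])
p = np.array([[5.0, 0.0], [5.0, 1.0], [5.0, 100.0], [2.0, -3.0]])

# One cross product per row of p, computed in a single call (no Python loop).
cross = cross2d(p - a, b - a)

# Same threshold as in the question; the boolean mask replaces the inner `if`.
tolerance = 750
found_ids = np.where((cross > -tolerance) & (cross < tolerance))[0]
near_points = p[found_ids]
```

The boolean expression inside np.where plays the role of the if statement in the original inner loop: it is evaluated for all points at once, and found_ids holds the row indices of the points that passed.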

Here is the (barely tested) modified main loop:

for point in tqdm(points):
    if len(points_list) > 0:
        points_list.remove(point)
    else:
        break

    start = Point(point)
    end = Point(start.x+length, start.y)
    line = LineString([start,end])

    # CHANGED PART

    if len(points_list) == 0:
        continue

    p = np.asarray(points_list)

    for angle in angles:
        rot_line = rotate(line, angle, origin=start, use_radians=True)
        a, b = np.asarray(rot_line.coords)
        cross = np.cross(p-a,b-a)
        foundIds = np.where((cross > -750) & (cross < 750))[0]

        if foundIds.size > 5:
            # Similar to the initial part, not efficient, but rarely executed
            line_list = p[foundIds][:5].tolist()
            line_list = sorted(line_list, key=lambda k: [k[1], k[0]])
            line_length = LineString(line_list)
            if line_length.length >= 10 and line_length.length <= 150:
                lines.append(line_list)

This is about 15 times faster on my machine.

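Most of the remaining time is spent in the very inefficient shapely module (especially rotate, and even np.asarray(rot_line.coords)). Indeed, each call to rotate takes about 50 microseconds, which is simply insane: it should take no more than 50 nanoseconds, that is, 1000 times less (in fact, optimized native code should be able to do this in under 20 ns on my machine). If you want faster code, consider not using this package (or improving its performance).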
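One possible way to avoid rotate altogether, sketched here as an assumption rather than as part of the answer above: rotating the segment from start to (start.x + length, start.y) counter-clockwise by an angle t about start puts its end point at start + length * (cos t, sin t), so the direction b - a can be computed with plain NumPy. Broadcasting then gives the cross products for all angles and all remaining points at once. The function name cross_all_angles and the sample points are hypothetical.

```python
import numpy as np

def cross_all_angles(start, points, angles, length):
    """Cross products of (p - a) and (b - a) for every angle and every point.

    a is `start`, and b - a = length * (cos t, sin t) for each angle t.
    Returns an array of shape (len(angles), len(points)).
    """
    start = np.asarray(start, dtype=float)
    rel = np.asarray(points, dtype=float) - start       # p - a, shape (n_points, 2)
    angles = np.asarray(angles, dtype=float)
    dx = length * np.cos(angles)[:, None]               # shape (n_angles, 1)
    dy = length * np.sin(angles)[:, None]
    # 2-D cross product u x v = u_x * v_y - u_y * v_x, broadcast over angles.
    return rel[:, 0] * dy - rel[:, 1] * dx

# Hypothetical sample data, using the same angles and length as the question.
angles = np.deg2rad(np.arange(360))
length = np.sqrt(2) * 200
start = (0, 0)
points = [[10, 0], [0, 10], [100, 100]]

cross = cross_all_angles(start, points, angles, length)
mask = (cross > -750) & (cross < 750)   # True where a point is on/near the line
counts = mask.sum(axis=1)               # number of near points for each angle
```

With such a mask, only the angles whose count exceeds 5 would need the sorting and length check from the loop above, for example by iterating over np.nonzero(counts > 5)[0]. Whether this is actually faster for a given data set has to be measured.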

Regarding python - Replacing nested for loops combined with conditions to boost performance, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/74479770/
