gpt4 book ai didi

python - Python中Pareto前沿的快速计算

转载 作者:太空狗 更新时间:2023-10-29 17:02:21 24 4
gpt4 key购买 nike

我在 3D 空间中有一组点,我需要从中找到 Pareto 边界。执行速度在这里非常重要,随着我加分测试,时间增加得非常快。

点集如下所示:

[[0.3296170319979843, 0.0, 0.44472108843537406], [0.3296170319979843,0.0, 0.44472108843537406], [0.32920760896951373, 0.0, 0.4440408163265306], [0.32920760896951373, 0.0, 0.4440408163265306], [0.33815192743764166, 0.0, 0.44356462585034007]]

现在,我正在使用这个算法:

def dominates(row, candidateRow):
return sum([row[x] >= candidateRow[x] for x in range(len(row))]) == len(row)

def simple_cull(inputPoints, dominates):
paretoPoints = set()
candidateRowNr = 0
dominatedPoints = set()
while True:
candidateRow = inputPoints[candidateRowNr]
inputPoints.remove(candidateRow)
rowNr = 0
nonDominated = True
while len(inputPoints) != 0 and rowNr < len(inputPoints):
row = inputPoints[rowNr]
if dominates(candidateRow, row):
# If it is worse on all features remove the row from the array
inputPoints.remove(row)
dominatedPoints.add(tuple(row))
elif dominates(row, candidateRow):
nonDominated = False
dominatedPoints.add(tuple(candidateRow))
rowNr += 1
else:
rowNr += 1

if nonDominated:
# add the non-dominated point to the Pareto frontier
paretoPoints.add(tuple(candidateRow))

if len(inputPoints) == 0:
break
return paretoPoints, dominatedPoints

在这里找到:http://code.activestate.com/recipes/578287-multidimensional-pareto-front/

在一组解决方案中找到一组非支配解决方案的最快方法是什么?或者,简而言之,Python 能比这个算法做得更好吗?

最佳答案

如果您担心实际速度,您肯定想使用 numpy(因为巧妙的算法调整可能比使用数组操作获得的 yield 要小得多)。以下是三个计算相同函数的解决方案。 is_pareto_efficient_dumb 解决方案在大多数情况下较慢,但随着成本数量的增加而变得更快,is_pareto_efficient_simple 解决方案在许多点上比 dumb 解决方案更有效,最终is_pareto_efficient 函数可读性较差但速度最快(因此所有都是 Pareto 有效的!)。

import numpy as np


# Very slow for many datapoints. Fastest for many costs, most readable
def is_pareto_efficient_dumb(costs):
"""
Find the pareto-efficient points
:param costs: An (n_points, n_costs) array
:return: A (n_points, ) boolean array, indicating whether each point is Pareto efficient
"""
is_efficient = np.ones(costs.shape[0], dtype = bool)
for i, c in enumerate(costs):
is_efficient[i] = np.all(np.any(costs[:i]>c, axis=1)) and np.all(np.any(costs[i+1:]>c, axis=1))
return is_efficient


# Fairly fast for many datapoints, less fast for many costs, somewhat readable
def is_pareto_efficient_simple(costs):
"""
Find the pareto-efficient points
:param costs: An (n_points, n_costs) array
:return: A (n_points, ) boolean array, indicating whether each point is Pareto efficient
"""
is_efficient = np.ones(costs.shape[0], dtype = bool)
for i, c in enumerate(costs):
if is_efficient[i]:
is_efficient[is_efficient] = np.any(costs[is_efficient]<c, axis=1) # Keep any point with a lower cost
is_efficient[i] = True # And keep self
return is_efficient


# Faster than is_pareto_efficient_simple, but less readable.
def is_pareto_efficient(costs, return_mask = True):
"""
Find the pareto-efficient points
:param costs: An (n_points, n_costs) array
:param return_mask: True to return a mask
:return: An array of indices of pareto-efficient points.
If return_mask is True, this will be an (n_points, ) boolean array
Otherwise it will be a (n_efficient_points, ) integer array of indices.
"""
is_efficient = np.arange(costs.shape[0])
n_points = costs.shape[0]
next_point_index = 0 # Next index in the is_efficient array to search for
while next_point_index<len(costs):
nondominated_point_mask = np.any(costs<costs[next_point_index], axis=1)
nondominated_point_mask[next_point_index] = True
is_efficient = is_efficient[nondominated_point_mask] # Remove dominated points
costs = costs[nondominated_point_mask]
next_point_index = np.sum(nondominated_point_mask[:next_point_index])+1
if return_mask:
is_efficient_mask = np.zeros(n_points, dtype = bool)
is_efficient_mask[is_efficient] = True
return is_efficient_mask
else:
return is_efficient

分析测试(使用从正态分布中抽取的点):

有 10,000 个样本,2 个成本:

is_pareto_efficient_dumb: Elapsed time is 1.586s
is_pareto_efficient_simple: Elapsed time is 0.009653s
is_pareto_efficient: Elapsed time is 0.005479s

有 1,000,000 个样本,2 个成本:

is_pareto_efficient_dumb: Really, really, slow
is_pareto_efficient_simple: Elapsed time is 1.174s
is_pareto_efficient: Elapsed time is 0.4033s

有 10,000 个样本,15 个成本:

is_pareto_efficient_dumb: Elapsed time is 4.019s
is_pareto_efficient_simple: Elapsed time is 6.466s
is_pareto_efficient: Elapsed time is 6.41s

请注意,如果效率是一个问题,您可以通过预先对数据重新排序来获得大约 2 倍左右的加速,请参阅 here .

关于python - Python中Pareto前沿的快速计算,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/32791911/

24 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com