
python - "bucketsort" using Python multiprocessing

Reposted. Author: 行者123 · Updated: 2023-12-01 05:08:51

I have a series of uniformly distributed data. I want to exploit that distribution to sort the data in parallel: for N CPUs, I essentially define N buckets and sort the buckets in parallel. My problem is that I get no speedup.

What is going wrong?

from multiprocessing import Process, Queue
from numpy import array, linspace, arange, where, cumsum, zeros
from numpy.random import rand
from time import time


def my_sort(x, y):
    y.put(x.get().argsort())

def my_par_sort(X, np):
    p_list = []
    Xq = Queue()
    Yq = Queue()
    bmin = linspace(X.min(), X.max(), np + 1)   # bucket lower bounds
    bmax = array(bmin); bmax[-1] = X.max() + 1  # bucket upper bounds
    B = []
    Bsz = [0]
    for i in range(np):
        b = array([bmin[i] <= X, X < bmax[i + 1]]).all(0)
        B.append(where(b)[0])
        Bsz.append(len(B[-1]))
        Xq.put(X[b])
        p = Process(target=my_sort, args=(Xq, Yq))
        p.start()
        p_list.append(p)

    Bsz = cumsum(Bsz).tolist()
    Y = zeros(len(X))
    for i in range(np):
        Y[arange(Bsz[i], Bsz[i + 1])] = B[i][Yq.get()]
        p_list[i].join()

    return Y


if __name__ == '__main__':
    num_el = int(1e7)
    mydata = rand(num_el)
    np = 4  # multiprocessing.cpu_count()
    starttime = time()
    I = my_par_sort(mydata, np)
    print("Sorting %0.0e keys took %0.1fs using %0.0f processes" % (len(mydata), time() - starttime, np))
    starttime = time()
    I2 = mydata.argsort()
    print("in serial it takes %0.1fs" % (time() - starttime))
    print((I == I2).all())

Best Answer

It looks like your problem is the overhead you add when splitting the original array into pieces. I took your code and simply removed all use of multiprocessing:

def my_sort(x, y):
    pass
    # y.put(x.get().argsort())

def my_par_sort(X, np, starttime):
    p_list = []
    Xq = Queue()
    Yq = Queue()
    bmin = linspace(X.min(), X.max(), np + 1)   # bucket lower bounds
    bmax = array(bmin); bmax[-1] = X.max() + 1  # bucket upper bounds
    B = []
    Bsz = [0]
    for i in range(np):
        b = array([bmin[i] <= X, X < bmax[i + 1]]).all(0)
        B.append(where(b)[0])
        Bsz.append(len(B[-1]))
        Xq.put(X[b])
        p = Process(target=my_sort, args=(Xq, Yq))
        p.start()
        p_list.append(p)
    return

if __name__ == '__main__':
    num_el = int(1e7)
    mydata = rand(num_el)
    np = 4  # multiprocessing.cpu_count()
    starttime = time()
    I = my_par_sort(mydata, np, starttime)
    print("Sorting %0.0e keys took %0.1fs using %0.0f processes" % (len(mydata), time() - starttime, np))
    starttime = time()
    I2 = mydata.argsort()
    print("in serial it takes %0.1fs" % (time() - starttime))
    # print((I == I2).all())

Without doing any sorting at all, the multiprocessing code takes just as long as the serial code:

Sorting 1e+07 keys took 2.2s using 4 processes
in serial it takes 2.2s

You might think that the cost of starting processes and passing values between them accounts for the overhead, but if I remove all use of multiprocessing, including the Xq.put(X[b]) call, it only ends up slightly faster:

Sorting 1e+07 keys took 1.9s using 4 processes
in serial it takes 2.2s
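To put a rough number on the raw serialization cost by itself (an illustrative measurement, not part of the original answer), one can time pickling an array of the same size directly; this is roughly what a Queue does under the hood when it transfers an array between processes:

```python
import pickle
import time
import numpy as np

# Same size as the array in the question: 1e7 float64 values, ~80 MB.
X = np.random.rand(10**7)

t0 = time.time()
buf = pickle.dumps(X, protocol=pickle.HIGHEST_PROTOCOL)
elapsed = time.time() - t0

# On typical hardware this is a small fraction of the ~2.2 s total above,
# consistent with partitioning, not IPC transfer, dominating the runtime.
print("pickled %d bytes in %.2fs" % (len(buf), elapsed))
```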

So it seems you need to look into a more efficient way of splitting the array into pieces.
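One reason the splitting is slow: the loop above scans the entire array once per bucket with a boolean mask, so partitioning alone costs O(np · N). A single-pass alternative (an illustrative sketch, not from the original answer) assigns every element its bucket index in one O(N) pass with numpy's digitize, then groups elements by bucket with a stable argsort:

```python
import numpy as np

def partition_once(X, nbuckets):
    # Equal-width bucket edges spanning the data range.
    edges = np.linspace(X.min(), X.max(), nbuckets + 1)
    # One pass: each element's bucket index in 0..nbuckets-1.
    # digitize against the interior edges, so the max lands in the last bucket.
    idx = np.digitize(X, edges[1:-1])
    # Stable argsort on the bucket indices groups elements bucket by bucket.
    order = np.argsort(idx, kind='stable')
    counts = np.bincount(idx, minlength=nbuckets)
    bounds = np.concatenate(([0], np.cumsum(counts)))
    # Each entry holds the original indices of the elements in that bucket.
    return [order[bounds[i]:bounds[i + 1]] for i in range(nbuckets)]
```

Each returned index array could then be handed to one worker process for sorting, replacing the per-bucket mask loop in my_par_sort.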

Regarding python - "bucketsort" using Python multiprocessing, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/24576622/
