
python - Matrix multiplication in Python with Map-Reduce on Hadoop

Reposted. Author: 可可西里. Last updated: 2023-11-01 15:51:38

I want to apply map-reduce to matrix multiplication using Python and Hadoop. The goal is to compute A * B. The output should be in a format similar to the input.

The input is the two matrices A and B, in a format that looks like this:

A,0,0,0.0
A,0,1,1.0
...
A,1,3,8.0
A,1,4,9.0
B,0,0,0.0
B,0,1,1.0
...
B,4,0,12.0
B,4,1,13.0

A,0,0,0.0 means that the entry of A at index (0,0) has the value 0.0; the same goes for B.

Here is my map function:

import sys

# Assumption: the two command-line arguments (as in "map.py 2 3") are the
# number of rows of A and the number of columns of B.
m = int(sys.argv[1])
p = int(sys.argv[2])

for line in sys.stdin:
    # Split line into array of entry data
    entry = line.strip().split(",")
    # Set row, column, and value for this entry
    row = int(entry[1])
    col = int(entry[2])
    value = float(entry[3])

    # If this is an entry in matrix A...
    if entry[0] == "A":
        # A(row, col) contributes to C(row, j) for every column j of B
        for j in range(p):
            print('{},{}\t{},{},{}'.format(row, j, 'A', col, value))
    # Otherwise, if this is an entry in matrix B...
    else:
        # B(row, col) contributes to C(i, col) for every row i of A
        for i in range(m):
            print('{},{}\t{},{},{}'.format(i, col, 'B', row, value))

I would like to know how to write the reduce function. This is the skeleton I will be using:

import sys
import string
import numpy

# number of columns of A/rows of B
n = int(sys.argv[1])

# Create data structures to hold the current row/column values (if needed; your code goes here)

currentkey = None

# input comes from STDIN (stream data that goes to the program)
for line in sys.stdin:

    # Remove leading and trailing whitespace
    line = line.strip()

    # Get key/value
    key, value = line.split('\t', 1)

    # Parse key/value input (your code goes here)

    # If we are still on the same key...
    if key == currentkey:
        # Process key/value pair (your code goes here)
        pass

    # Otherwise, if this is a new key...
    else:
        # If this is a new key and not the first key we've seen
        if currentkey:
            # compute/output result to STDOUT (your code goes here)
            pass

        currentkey = key

        # Process input for new key (your code goes here)

# Compute/output result for the last key (your code goes here)

To run the two functions, I will test them on a small test data set with the following command:

cat smalltest.txt | python src/map.py 2 3 | sort -n | python src/reduce.py 5

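The mapper emits key-value pairs, which sort -n then orders by key, so the reducer is what will handle the matrix computation. Where I am stuck is writing the reducer function.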
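
One way the reducer skeleton might be filled in is sketched below. It is not the original poster's code and rests on several assumptions: each input line has the form `i,j<TAB>M,k,value` (key = output cell, `M` = `A` or `B`, `k` = the shared index), the sort step leaves equal keys on adjacent lines, and any entry that never appears counts as 0.0. The names `reduce_lines` and `main` are hypothetical.

```python
import sys
from itertools import groupby


def reduce_lines(lines, n):
    """Yield 'C,i,j,value' lines from grouped 'i,j<TAB>M,k,value' lines.

    n is the number of columns of A (= number of rows of B).
    """
    # Skip blank lines and split each record into [key, value] at the first tab.
    records = (line.strip().split('\t', 1) for line in lines if line.strip())
    # groupby only merges adjacent records, so equal keys must be contiguous
    # (assumed to be guaranteed by the sort between mapper and reducer).
    for key, group in groupby(records, key=lambda kv: kv[0]):
        a_row = [0.0] * n  # A[i, k] for k = 0..n-1
        b_col = [0.0] * n  # B[k, j] for k = 0..n-1
        for _, value in group:
            name, k, v = value.split(',')
            if name == 'A':
                a_row[int(k)] = float(v)
            else:
                b_col[int(k)] = float(v)
        # C[i, j] is the dot product of row i of A and column j of B.
        total = sum(a * b for a, b in zip(a_row, b_col))
        yield 'C,{},{}'.format(key, total)


def main():
    # Hypothetical entry point: in reduce.py, call main() at the bottom.
    n = int(sys.argv[1])
    for out in reduce_lines(sys.stdin, n):
        print(out)
```

Keeping the logic in a generator that takes any iterable of lines makes it easy to try on a short list of strings before wiring it to sys.stdin.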

Best answer

I am not sure why you need reduce here.
This is my numpy approach (using a few string/list/zip tricks):

import numpy as np

strin = '''A,0,0,0.0
A,0,1,1.0
A,1,0,8.0
A,1,1,9.0
B,0,0,0.0
B,0,1,1.0
B,1,0,12.0
B,1,1,13.0'''.split()

lines = [*map(lambda x: x.split(","),strin)]

linesT = [*zip(*lines)]

linesT

[('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
('0', '0', '1', '1', '0', '0', '1', '1'),
('0', '1', '0', '1', '0', '1', '0', '1'),
('0.0', '1.0', '8.0', '9.0', '0.0', '1.0', '12.0', '13.0')]

Now we can get the dimensions and the data of the arrays A and B:

lastA = linesT[0].index("B") - 1

rowsA, colsA = int(linesT[1][lastA]) + 1, int(linesT[2][lastA]) + 1

datA = [*map(float, linesT[3][0:lastA + 1])]

A = np.array(datA).reshape((rowsA, colsA))

A
Out[50]:
array([[ 0.,  1.],
       [ 8.,  9.]])

firstB = lastA + 1

rowsB, colsB = int(linesT[1][-1]) + 1, int(linesT[2][-1]) + 1

datB = [*map(float, linesT[3][firstB::])]

B = np.array(datB).reshape((rowsB, colsB))

B
Out[51]:
array([[  0.,   1.],
       [ 12.,  13.]])

A @ B
Out[52]:
array([[  12.,   13.],
       [ 108.,  125.]])
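
The slicing above depends on the lines being complete, in row-major order, and with all of A before B. A minimal sketch that places each value by its indices instead, so line order no longer matters, could look like this. The helper names `parse_dense` and `matmul` are hypothetical, and treating a missing entry as 0.0 is an assumption. Plain lists keep the sketch dependency-free; with numpy the same idea would be an `np.zeros((rows, cols))` array filled by index assignment.

```python
def parse_dense(lines):
    """Return {name: nested list} built from 'M,i,j,value' lines in any order."""
    entries = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        name, i, j, v = line.split(',')
        entries.setdefault(name, {})[(int(i), int(j))] = float(v)

    dense = {}
    for name, cells in entries.items():
        # Dimensions come from the largest index seen for this matrix.
        n_rows = max(i for i, _ in cells) + 1
        n_cols = max(j for _, j in cells) + 1
        # Assumption: an entry that never appears is treated as 0.0.
        dense[name] = [[cells.get((i, j), 0.0) for j in range(n_cols)]
                       for i in range(n_rows)]
    return dense


def matmul(a, b):
    """Plain-Python product of two matrices stored as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


# Same 2x2 data as above, deliberately out of order.
shuffled = ["B,1,1,13.0", "A,1,0,8.0", "B,0,0,0.0", "A,0,1,1.0",
            "B,1,0,12.0", "A,0,0,0.0", "A,1,1,9.0", "B,0,1,1.0"]
matrices = parse_dense(shuffled)
product = matmul(matrices['A'], matrices['B'])
```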

Regarding "python - Matrix multiplication in Python with Map-Reduce on Hadoop", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/48649477/
