java - Outputting a matrix correctly in Spark Java

Reposted. Author: 行者123. Updated: 2023-11-30 10:52:50

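I would like to know how to get the correct output: I want the output to have the same format as the input. I am just not sure how to map the RowMatrix to get this output.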

Input file

0,0,0.0
0,1,1.0
0,2,2.0
0,3,3.0
0,4,4.0
1,0,5.0
1,1,6.0
1,2,7.0
1,3,8.0
1,4,9.0

Code

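import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.mllib.linalg.Vector;
import org.apache.spark.mllib.linalg.distributed.BlockMatrix;
import org.apache.spark.mllib.linalg.distributed.CoordinateMatrix;
import org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix;
import org.apache.spark.mllib.linalg.distributed.MatrixEntry;
import org.apache.spark.mllib.linalg.distributed.RowMatrix;
import org.apache.spark.rdd.RDD;

String inputPathA = "data/At.txt";
// An application name is required when not launched through spark-submit;
// "MatrixOutput" is only a placeholder.
SparkConf conf = new SparkConf().setMaster("local").setAppName("MatrixOutput");
JavaSparkContext sc = new JavaSparkContext(conf);

JavaRDD<String> fileA = sc.textFile(inputPathA);

// Parse each "i,j,value" line into a MatrixEntry.
JavaRDD<MatrixEntry> matrixA = fileA.map(new Function<String, MatrixEntry>() {
    public MatrixEntry call(String x) {
        String[] indeceValue = x.split(",");
        long i = Long.parseLong(indeceValue[0]);
        long j = Long.parseLong(indeceValue[1]);
        double value = Double.parseDouble(indeceValue[2]);
        return new MatrixEntry(i, j, value);
    }
});

CoordinateMatrix cooMatrixA = new CoordinateMatrix(matrixA.rdd());
BlockMatrix matA = cooMatrixA.toBlockMatrix();
// Compute A^T * A.
BlockMatrix ata = matA.transpose().multiply(matA);
IndexedRowMatrix id = ata.toIndexedRowMatrix();
RowMatrix rm = id.toRowMatrix();
RDD<Vector> result = rm.rows();
result.saveAsTextFile("data/output1");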

The output I get

(5,[0,1,2,3,4],[45.0,58.0,71.0,84.0,97.0])
(5,[0,1,2,3,4],[25.0,30.0,35.0,40.0,45.0])
(5,[0,1,2,3,4],[30.0,37.0,44.0,51.0,58.0])
(5,[0,1,2,3,4],[40.0,51.0,62.0,73.0,84.0])
(5,[0,1,2,3,4],[35.0,44.0,53.0,62.0,71.0])

How can I map this correctly in Spark (Java) so that it matches my input format?

Best answer

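A RowMatrix has no meaningful row indices, so it cannot be converted back to the same shape as the input. Instead, you can simply convert the BlockMatrix back to a CoordinateMatrix and prepare a JavaRDD<String> that can be saved: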

// Each MatrixEntry still carries its own (i, j) indices.
JavaRDD<MatrixEntry> entries = ata.toCoordinateMatrix().entries().toJavaRDD();
JavaRDD<String> output = entries.map(new Function<MatrixEntry, String>() {
    @Override
    public String call(MatrixEntry e) {
        // Same "i,j,value" layout as the input file.
        return String.format("%d,%d,%s", e.i(), e.j(), e.value());
    }
});
output.saveAsTextFile("data/output1");
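
To see what lines this format string produces without running a cluster, here is a minimal plain-Java sketch (no Spark involved). It computes A^T * A for the 2x5 sample matrix from the question and prints one "i,j,value" line per entry. The class name CoordinateFormatSketch and the helpers formatEntry and gramEntries are hypothetical names made up for this illustration; they are not part of Spark.

```java
import java.util.ArrayList;
import java.util.List;

public class CoordinateFormatSketch {

    // Same format string as in the answer: row index, column index, value.
    static String formatEntry(long i, long j, double value) {
        return String.format("%d,%d,%s", i, j, value);
    }

    // Computes A^T * A for a small dense matrix and returns one
    // "i,j,value" line per entry, in row-major order.
    static List<String> gramEntries(double[][] a) {
        int cols = a[0].length;
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < cols; i++) {
            for (int j = 0; j < cols; j++) {
                double sum = 0.0;
                for (double[] row : a) {
                    sum += row[i] * row[j];
                }
                lines.add(formatEntry(i, j, sum));
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // The sample input from the question, written as a dense 2x5 matrix.
        double[][] a = {
            {0.0, 1.0, 2.0, 3.0, 4.0},
            {5.0, 6.0, 7.0, 8.0, 9.0}
        };
        for (String line : gramEntries(a)) {
            System.out.println(line);
        }
    }
}
```

The first line printed is 0,0,25.0 and the last is 4,4,97.0, which agrees with the values in the vectors shown in the question. This sketch emits entries in row-major order; the lines saved by Spark may appear in a different order, so sort them afterwards if the order matters.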

Regarding "java - Outputting a matrix correctly in Spark Java", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/34258825/
