
scala - How to convert from org.apache.spark.mllib.linalg.SparseVector to org.apache.spark.ml.linalg.SparseVector?

Reposted. Author: 行者123. Updated: 2023-12-03 18:46:26

How do I convert from org.apache.spark.mllib.linalg.SparseVector to org.apache.spark.ml.linalg.SparseVector?

I am migrating code from the mllib API to the ml API.

import org.apache.spark.mllib.linalg.{DenseVector, Vector}
import org.apache.spark.ml.linalg.{DenseVector => NewDenseVector, Vector => NewVector}
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.ml.feature.{LabeledPoint => NewLabeledPoint}

val labelPointData = limitedTable.rdd.map { row =>
new NewLabeledPoint(convertToDouble(row.head), row(1).asInstanceOf[org.apache.spark.ml.linalg.SparseVector])
}

The expression row(1).asInstanceOf[org.apache.spark.ml.linalg.SparseVector] fails with the following exception:
org.apache.spark.mllib.linalg.SparseVector cannot be cast to org.apache.spark.ml.linalg.SparseVector
How can I get around this?

I found code that converts from mllib to ml, but not the reverse.

Best answer

Conversion is possible in both directions. First, let's create an mllib SparseVector:

import org.apache.spark.mllib.linalg.Vectors
// Indices must be smaller than the vector size (3), so use 0, 1, 2; values are Doubles.
val mllibVec: org.apache.spark.mllib.linalg.Vector = Vectors.sparse(3, Array(0, 1, 2), Array(1.0, 2.0, 3.0))

To convert it to an ML SparseVector, just call asML:
val mlVec: org.apache.spark.ml.linalg.Vector = mllibVec.asML

To convert back again, the simplest way is to use Vectors.fromML():
val mllibVec2: org.apache.spark.mllib.linalg.Vector = Vectors.fromML(mlVec)
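
The two conversions above can be checked together as a round trip. The following is a minimal sketch meant for spark-shell (or any Scala script with spark-mllib on the classpath). The aliases OldVectors, OldVector, NewVector and the value names are illustrative choices, not part of the original answer.

```scala
import org.apache.spark.mllib.linalg.{Vectors => OldVectors, Vector => OldVector, SparseVector => OldSparseVector}
import org.apache.spark.ml.linalg.{Vector => NewVector, SparseVector => NewSparseVector}

// Build an mllib sparse vector of size 4 with non-zeros at indices 0 and 2.
val oldVec: OldVector = OldVectors.sparse(4, Array(0, 2), Array(1.0, 3.0))

// mllib -> ml: asML is defined on the mllib Vector trait.
val newVec: NewVector = oldVec.asML

// ml -> mllib: fromML is defined on the mllib Vectors object.
val roundTrip: OldVector = OldVectors.fromML(newVec)

// The concrete class follows the package; the contents stay the same.
val isNewSparse = newVec.isInstanceOf[NewSparseVector]
val isOldSparse = roundTrip.isInstanceOf[OldSparseVector]
val sameContents = roundTrip == oldVec
```

Renaming the imports keeps both Vector types usable in one file without spelling out the full package names each time.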

Also, in your code, instead of row(1).asInstanceOf[SparseVector] you could try row.getAs[SparseVector](1). Read the vector as an mllib vector, convert it with asML, and pass the result to the ML-based LabeledPoint, i.e.:
val labelPointData = limitedTable.rdd.map { row =>
NewLabeledPoint(convertToDouble(row.head), row.getAs[org.apache.spark.mllib.linalg.SparseVector](1).asML)
}
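
If the vectors live in a DataFrame column, a different route from the per-row asML call above is to convert the whole column with MLUtils.convertVectorColumnsToML before mapping. This is a sketch under assumptions: the local SparkSession, the column names "label" and "features", and the sample rows are made up for illustration and do not come from the question.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.mllib.linalg.{Vectors => OldVectors}
import org.apache.spark.mllib.util.MLUtils
import org.apache.spark.ml.linalg.{Vector => NewVector}

// Hypothetical local session; in spark-shell the existing `spark` can be used instead.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("vector-conversion-sketch")
  .getOrCreate()

// Hypothetical sample data: a label plus an mllib vector column.
val df = spark.createDataFrame(Seq(
  (1.0, OldVectors.sparse(3, Array(0, 2), Array(1.0, 2.0))),
  (0.0, OldVectors.sparse(3, Array(1), Array(5.0)))
)).toDF("label", "features")

// Convert the named mllib vector column to the ml vector type.
val converted = MLUtils.convertVectorColumnsToML(df, "features")

// Rows now carry org.apache.spark.ml.linalg vectors, so getAs with the ml type works.
val firstVec = converted.select("features").head().getAs[NewVector](0)
```

After this conversion, row.getAs[org.apache.spark.ml.linalg.Vector](1) can be passed to the ML LabeledPoint directly, without calling asML per row.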

Regarding "scala - How to convert from org.apache.spark.mllib.linalg.SparseVector to org.apache.spark.ml.linalg.SparseVector?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/46639021/
