
list - Spark DataFrame values to a Scala List

Reposted. Author: 行者123. Updated: 2023-12-04 08:01:00

I have a DataFrame with a column that contains arrays:

+-------+------------------+
| User  | Color            |
+-------+------------------+
| User1 | [Green,Blue,Red] |
| User2 | [Blue,Red]       |
+-------+------------------+
I am trying to filter for User1 and get its colors into a Scala List:
val colorsList: List[String] = List("Green","Blue","Red")
This is what I have tried so far (output added as comments):
Attempt 1:
val dfTest1 = myDataframe.where("User=='User1'").select("Color").rdd.map(r => r(0)).collect()
println(dfTest1) //[Ljava.lang.Object;@44022255
for(EachColor<- dfTest1){
println(EachColor) //WrappedArray(Green, Blue, Red)
}
Attempt 2:
val dfTest2 = myDataframe.where("User=='User1'").select("Color").collectAsList.get(0).getList(0)
println(dfTest2) //[Green, Blue, Red] but type is util.List[Nothing]
Attempt 3:
val dfTest32 = myDataframe.where("User=='User1'").select("Color").rdd.map(r => r(0)).collect.toList 
println(dfTest32) //List(WrappedArray(Green, Blue, Red))

for(EachColor <- dfTest32){
println(EachColor) //WrappedArray(Green, Blue, Red)
}
Attempt 4:
val dfTest31 = myDataframe.where("User=='User1'").select("Color").map(r => r.getString(0)).collect.toList    
//Exception : scala.collection.mutable.WrappedArray$ofRef cannot be cast to java.lang.String

Best answer

You can read the value as a Seq[String] and convert it with toList:

val colorsList = df.where("User=='User1'")
  .select("Color")
  .rdd.map(r => r.getAs[Seq[String]](0))
  .collect()(0)
  .toList
Or, equivalently:
val colorsList = df.where("User=='User1'")
  .select("Color")
  .collect()(0)
  .getAs[Seq[String]](0)
  .toList
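
Both snippets index the collected array with (0), which throws if no row matches the filter. Attempt 4 in the question fails for a different reason: the Color column holds an array, so getString cannot cast it. The sketch below is one possible variation on the answer, not part of it. It reads the column with Row.getSeq and falls back to an empty list when the user is missing. The names ColorsExample and colorsFor, the local SparkSession and the sample data are assumptions made for illustration, and the code assumes the Color value is never null.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object ColorsExample {
  // Hypothetical helper: returns the Color array of the first row whose
  // User column equals `user`, or an empty list if no row matches.
  // Assumes the Color value itself is not null.
  def colorsFor(df: DataFrame, user: String): List[String] =
    df.where(df("User") === user)
      .select("Color")
      .collect()                        // Array[Row], possibly empty
      .headOption                       // avoids indexing an empty array
      .map(_.getSeq[String](0).toList)  // array column -> Seq[String] -> List[String]
      .getOrElse(Nil)

  def main(args: Array[String]): Unit = {
    // Local session and sample data mirroring the table in the question.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("colors-example")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(
      ("User1", Seq("Green", "Blue", "Red")),
      ("User2", Seq("Blue", "Red"))
    ).toDF("User", "Color")

    println(colorsFor(df, "User1")) // List(Green, Blue, Red)
    println(colorsFor(df, "User3")) // List()

    spark.stop()
  }
}
```

Returning Nil for a missing user is a design choice; if a missing user should count as an error, returning Option[List[String]] instead would make that case explicit to the caller.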

Regarding "list - Spark DataFrame values to a Scala List", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/66470185/
