
scala - Spark Scala getting class not found scala.Any


val schema = df.schema
val x = df.flatMap(r =>
  (0 until schema.length).map { idx =>
    ((idx, r.get(idx)), 1l)
  }
)

This produces the error:
java.lang.ClassNotFoundException: scala.Any

I can't figure out why. Any help?

Best Answer

One approach is to cast all columns to String. The underlying problem is that r.get(idx) returns Any, and Spark cannot derive an Encoder for scala.Any, which surfaces as the ClassNotFoundException. After the cast, note that I am changing r.get(idx) in your code to r.getString(idx). The session below works (a self-contained sketch of the same code follows the session).

scala> val df = Seq(("ServiceCent4","AP-1-IOO-PPP","241.206.155.172","06-12-18:17:42:34",162,53,1544098354885L)).toDF("COL1","COL2","COL3","EventTime","COL4","COL5","COL6")
df: org.apache.spark.sql.DataFrame = [COL1: string, COL2: string ... 5 more fields]

scala> df.show(1,false)
+------------+------------+---------------+-----------------+----+----+-------------+
|COL1 |COL2 |COL3 |EventTime |COL4|COL5|COL6 |
+------------+------------+---------------+-----------------+----+----+-------------+
|ServiceCent4|AP-1-IOO-PPP|241.206.155.172|06-12-18:17:42:34|162 |53 |1544098354885|
+------------+------------+---------------+-----------------+----+----+-------------+
only showing top 1 row

scala> df.printSchema
root
|-- COL1: string (nullable = true)
|-- COL2: string (nullable = true)
|-- COL3: string (nullable = true)
|-- EventTime: string (nullable = true)
|-- COL4: integer (nullable = false)
|-- COL5: integer (nullable = false)
|-- COL6: long (nullable = false)


scala> val schema = df.schema
schema: org.apache.spark.sql.types.StructType = StructType(StructField(COL1,StringType,true), StructField(COL2,StringType,true), StructField(COL3,StringType,true), StructField(EventTime,StringType,true), StructField(COL4,IntegerType,false), StructField(COL5,IntegerType,false), StructField(COL6,LongType,false))

scala> val df2 = df.columns.foldLeft(df){ (acc,r) => acc.withColumn(r,col(r).cast("string")) }
df2: org.apache.spark.sql.DataFrame = [COL1: string, COL2: string ... 5 more fields]

scala> df2.printSchema
root
|-- COL1: string (nullable = true)
|-- COL2: string (nullable = true)
|-- COL3: string (nullable = true)
|-- EventTime: string (nullable = true)
|-- COL4: string (nullable = false)
|-- COL5: string (nullable = false)
|-- COL6: string (nullable = false)


scala> val x = df2.flatMap(r => (0 until schema.length).map { idx => ((idx, r.getString(idx)), 1l) } )
x: org.apache.spark.sql.Dataset[((Int, String), Long)] = [_1: struct<_1: int, _2: string>, _2: bigint]

scala> x.show(5,false)
+---------------------+---+
|_1 |_2 |
+---------------------+---+
|[0,ServiceCent4] |1 |
|[1,AP-1-IOO-PPP] |1 |
|[2,241.206.155.172] |1 |
|[3,06-12-18:17:42:34]|1 |
|[4,162] |1 |
+---------------------+---+
only showing top 5 rows


scala>
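For reference, here is a minimal self-contained sketch of the same approach outside the shell. It assumes a local SparkSession named spark (the object name AnyEncoderFix is only illustrative); in spark-shell the imports below are already in scope.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

object AnyEncoderFix {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("AnyEncoderFix").master("local[*]").getOrCreate()
    import spark.implicits._

    // Example DataFrame from the answer above.
    val df = Seq(("ServiceCent4", "AP-1-IOO-PPP", "241.206.155.172", "06-12-18:17:42:34", 162, 53, 1544098354885L))
      .toDF("COL1", "COL2", "COL3", "EventTime", "COL4", "COL5", "COL6")

    // Cast every column to string so that r.getString(idx) returns a concrete,
    // encodable type instead of Any.
    val df2 = df.columns.foldLeft(df) { (acc, c) => acc.withColumn(c, col(c).cast("string")) }

    val schema = df2.schema
    // ((columnIndex, cellValue), 1L) for every cell of every row.
    val x = df2.flatMap(r => (0 until schema.length).map { idx => ((idx, r.getString(idx)), 1L) })

    x.show(5, false)
    spark.stop()
  }
}
```

If you would rather not cast the whole DataFrame, one alternative is to convert each value at access time, for example with String.valueOf(r.get(idx)), which also gives the encoder a concrete element type.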

Regarding "scala - Spark Scala getting class not found scala.Any", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/53766957/
