
scala - Spark Scala UDF parameter limit of 10

Reposted · Author: 行者123 · Updated: 2023-12-02 03:41:01

I need to create a Spark UDF with 11 parameters. Is there any way to do this?
I know we can create a UDF with at most 10 parameters.

Below is the 10-parameter version, which works:

// isEmpty is the asker's own helper (e.g. org.apache.commons.lang3.StringUtils.isEmpty)
val testFunc1 = (one: String, two: String, three: String, four: String,
                 five: String, six: String, seven: String, eight: String,
                 nine: String, ten: String) => {
  if (isEmpty(four)) false
  else four match {
    case "RDIS" => three == "ST"
    case "TTSC" => nine == "UT" && eight == "RR"
    case _ => false
  }
}

import org.apache.spark.sql.functions.udf
udf(testFunc1)

Below is the 11-parameter version. It fails with an "unspecified value parameter: dataType" error. (This happens because the udf helper in org.apache.spark.sql.functions is only overloaded for function literals of up to 10 arguments; with 11, the compiler falls through to the udf(f: AnyRef, dataType: DataType) overload and complains that the dataType argument is missing.)

val testFunc2 = (one: String, two: String, three: String, four: String,
                 five: String, six: String, seven: String, eight: String,
                 nine: String, ten: String, ELEVEN: String) => {
  if (isEmpty(four)) false
  else four match {
    case "RDIS" => three == "ST"
    case "TTSC" => nine == "UT" && eight == "RR" && ELEVEN == "OR"
    case _ => false
  }
}

import org.apache.spark.sql.functions.udf
udf(testFunc2) // compilation error

Best answer

I suggest packing the parameters into a Map:

import org.apache.spark.sql.functions._

val df = sc.parallelize(Seq(("a", "b"), ("c", "d"), ("e", "f"))).toDF("one", "two")

val myUDF = udf((input: Map[String, String]) => {
  // do something with the input
  input("one") == "a"
})

df
  .withColumn("udf_args", map(
    lit("one"), $"one",
    lit("two"), $"two"
  ))
  .withColumn("udf_result", myUDF($"udf_args"))
  .show()

+---+---+--------------------+----------+
|one|two| udf_args|udf_result|
+---+---+--------------------+----------+
| a| b|Map(one -> a, two...| true|
| c| d|Map(one -> c, two...| false|
| e| f|Map(one -> e, two...| false|
+---+---+--------------------+----------+
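Applied to the asker's 11-argument function, the same Map-packing pattern could look like the sketch below. This is only a sketch: the map keys mirror the asker's parameter names, and the predicate is written as a plain function over the Map so it can be checked without a SparkSession; the commented lines show the assumed Spark wiring, following the answer above.

```scala
// The same rules as testFunc2, but over one Map of named arguments
// instead of 11 separate String parameters. The key names ("three",
// "four", ...) are assumptions mirroring the asker's parameter names.
def checkRules(args: Map[String, String]): Boolean = {
  val four = args.getOrElse("four", "")
  if (four.isEmpty) false
  else four match {
    case "RDIS" => args.get("three").contains("ST")
    case "TTSC" => args.get("nine").contains("UT") &&
                   args.get("eight").contains("RR") &&
                   args.get("eleven").contains("OR")
    case _ => false
  }
}

// Assumed Spark wiring, analogous to the answer above:
// import org.apache.spark.sql.functions.{col, lit, map, udf}
// val myUDF = udf(checkRules _)
// df.withColumn("udf_args", map(
//     lit("three"), col("three"), lit("four"), col("four"),
//     lit("eight"), col("eight"), lit("nine"), col("nine"),
//     lit("eleven"), col("eleven")))
//   .withColumn("udf_result", myUDF(col("udf_args")))
```

Separating the predicate from the udf wrapper also keeps the business rules unit-testable, since only the map(...) construction needs Spark.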

Regarding "scala - Spark Scala UDF parameter limit of 10", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/48637297/
