
apache-spark - Spark UDAF: java.lang.InternalError: Malformed class name

Reposted. Author: 行者123. Updated: 2023-12-04 19:41:43

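I am using Spark 1.5.0 from the CDH 5.5.2 distribution. I switched from Scala 2.10.4 to Scala 2.10.5. I am using the following code for a UDAF. Is this a String vs. UTF8String issue? If so, any help would be greatly appreciated.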

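import org.apache.spark.sql.Row
import org.apache.spark.sql.expressions.{MutableAggregationBuffer, UserDefinedAggregateFunction}
import org.apache.spark.sql.types._
import org.apache.spark.unsafe.types.UTF8String
import scala.collection.mutable.ArrayBuffer

// Concatenates the non-null string values of a group into one comma-separated string.
object GroupConcat extends UserDefinedAggregateFunction {
  def inputSchema = new StructType().add("x", StringType)
  def bufferSchema = new StructType().add("buff", ArrayType(StringType))
  def dataType = StringType
  def deterministic = true

  // Start each group with an empty buffer.
  def initialize(buffer: MutableAggregationBuffer) = {
    buffer.update(0, ArrayBuffer.empty[String])
  }

  // Append the incoming value, skipping nulls.
  def update(buffer: MutableAggregationBuffer, input: Row) = {
    if (!input.isNullAt(0))
      buffer.update(0, buffer.getSeq[String](0) :+ input.getString(0))
  }

  // Combine two partial buffers.
  def merge(buffer1: MutableAggregationBuffer, buffer2: Row) = {
    buffer1.update(0, buffer1.getSeq[String](0) ++ buffer2.getSeq[String](0))
  }

  def evaluate(buffer: Row) = UTF8String.fromString(
    buffer.getSeq[String](0).mkString(","))
}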

However, I get this error message at runtime:
Exception in thread "main" java.lang.InternalError: Malformed class name
at java.lang.Class.getSimpleName(Class.java:1190)
at org.apache.spark.sql.execution.aggregate.ScalaUDAF.toString(udaf.scala:464)
at java.lang.String.valueOf(String.java:2847)
at java.lang.StringBuilder.append(StringBuilder.java:128)
at scala.StringContext.standardInterpolator(StringContext.scala:122)
at scala.StringContext.s(StringContext.scala:90)
at org.apache.spark.sql.catalyst.expressions.aggregate.AggregateExpression2.toString(interfaces.scala:96)
at org.apache.spark.sql.catalyst.expressions.Expression.prettyString(Expression.scala:174)
at org.apache.spark.sql.GroupedData$$anonfun$1.apply(GroupedData.scala:86)
at org.apache.spark.sql.GroupedData$$anonfun$1.apply(GroupedData.scala:80)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at org.apache.spark.sql.GroupedData.toDF(GroupedData.scala:80)
at org.apache.spark.sql.GroupedData.agg(GroupedData.scala:227)

Best answer

I got the same exception because my object extending UserDefinedAggregateFunction was defined inside another function.

Change this:

object Driver {
  def main(args: Array[String]) {

    object GroupConcat extends UserDefinedAggregateFunction {
      ...
    }
  }
}

To this:
object Driver {
  def main(args: Array[String]) {
    ...
  }

  object GroupConcat extends UserDefinedAggregateFunction {
    ...
  }
}
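
Why the nesting matters can be seen without Spark. The stack trace shows ScalaUDAF.toString calling Class.getSimpleName, and that call is what throws. The following is only a minimal sketch, not code from the question: ClassNameDemo, MemberAgg, LocalAgg and simpleNameOrError are hypothetical names. It prints the JVM class name of an object declared as a member next to one declared inside a method body, together with the result of getSimpleName. My understanding is that on Java 8 and earlier the method-local object is the one that fails with "Malformed class name", while newer JDKs may simply return a name, so the exact output depends on the Scala and JDK versions in use.

```scala
// Minimal sketch: compare the class names of an object declared as a member
// with one declared inside a method body. All names here are hypothetical.
object ClassNameDemo {

  // Declared as a member of an enclosing object (the layout the answer recommends).
  object MemberAgg

  // Class.getSimpleName can throw java.lang.InternalError. scala.util.Try does
  // not catch that kind of error, so it is caught explicitly here.
  def simpleNameOrError(c: Class[_]): String =
    try c.getSimpleName
    catch { case e: InternalError => "InternalError: " + e.getMessage }

  def main(args: Array[String]): Unit = {
    // Declared inside a method body (the layout that triggered the exception).
    object LocalAgg

    for (c <- Seq[Class[_]](MemberAgg.getClass, LocalAgg.getClass)) {
      // Print the full JVM name and what getSimpleName makes of it.
      println(c.getName + " -> " + simpleNameOrError(c))
    }
  }
}
```

Running ClassNameDemo.main on the same Scala and JDK versions as the Spark job is a quick way to check whether a given layout is affected before changing the UDAF itself.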

Regarding apache-spark - Spark UDAF: java.lang.InternalError: Malformed class name, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/37959985/
