
scala - Spark - Scala - saveAsHadoopFile throws an error

Reposted. Author: 行者123. Updated: 2023-12-01 11:35:39

I am trying to solve this problem but cannot get any further. Can anyone help?

import org.apache.hadoop.mapred.lib.MultipleTextOutputFormat

class KeyBasedOutput[T >: Null, V <: AnyRef] extends MultipleTextOutputFormat[T, V] {
  // Route each record to an output file named after its key.
  override def generateFileNameForKeyValue(key: T, value: V, leaf: String): String =
    key.toString

  // Return null so the key is dropped and only the value is written.
  override def generateActualKey(key: T, value: V): T =
    null
}

val cp1 = sqlContext.sql("select * from d_prev_fact")
  .map(t => t.mkString("\t"))
  .map { x =>
    val parts = x.split("\t")
    val partition_key = parts(3)                           // 4th column is the partition date
    val rows = parts.slice(0, parts.length).mkString("\t") // the full row, re-joined
    ("date=" + partition_key, rows)
  }

cp1.saveAsHadoopFile(FACT_CP)
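As an aside, the key/value construction in the map above can be checked locally without a cluster. A plain-Scala sketch (the sample row and its field layout are made up for illustration):

```scala
object PartitionKeyDemo {
  // Builds ("date=<partition_key>", row) from a tab-separated row;
  // the 4th field (index 3) is taken as the partition key, as in the question.
  def toKeyValue(row: String): (String, String) = {
    val parts = row.split("\t")
    val partitionKey = parts(3)
    // Note: parts.slice(0, parts.length).mkString("\t") in the original
    // just rebuilds the whole row, so the row itself is used here.
    ("date=" + partitionKey, row)
  }

  def main(args: Array[String]): Unit = {
    val sample = "a\tb\tc\t2014-09-23\te" // made-up sample row
    println(toKeyValue(sample)._1)        // date=2014-09-23
  }
}
```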

I am getting the following error and cannot debug it:

scala> cp1.saveAsHadoopFile(FACT_CP,classOf[String],classOf[String],classOf[KeyBasedOutput[String, String]])
java.lang.RuntimeException: java.lang.NoSuchMethodException: $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$KeyBasedOutput.<init>()
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:131)
at org.apache.hadoop.mapred.JobConf.getOutputFormat(JobConf.java:709)
at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopDataset(PairRDDFunctions.scala:742)
at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:674)

The idea is to write the values into multiple folders based on the key.

Best Answer

Put KeyBasedOutput into a jar and start the shell with spark-shell --jars /path/to/the/jar
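The jar fixes the NoSuchMethodException because Hadoop instantiates the output format via reflection (JobConf.getOutputFormat calls ReflectionUtils.newInstance), and that requires a public no-arg constructor. A class defined at the spark-shell prompt is compiled as an inner class of the interpreter's wrapper objects (the $iwC$$iwC... prefix in the stack trace), so its only constructor takes a hidden reference to the enclosing wrapper. The difference can be shown in plain Scala (class names here are illustrative):

```scala
object InnerClassDemo {
  class TopLevel // nested in an object: compiles to a static-like class, no-arg constructor
  class Outer {
    class Inner  // true inner class: its constructor takes the enclosing Outer instance
  }

  def main(args: Array[String]): Unit = {
    val top   = classOf[TopLevel].getConstructors.head.getParameterCount
    val inner = classOf[Outer#Inner].getConstructors.head.getParameterCount
    println(s"TopLevel ctor params: $top")  // 0: reflection can instantiate it
    println(s"Inner ctor params: $inner")   // 1: newInstance() would throw, as in the error
  }
}
```

Compiling KeyBasedOutput into a jar makes it a top-level class with a no-arg constructor, which is exactly what the reflective instantiation needs.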

For "scala - Spark - Scala - saveAsHadoopFile throws an error", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/25996822/
