scala - Creating a DataFrame from a case class


I have read other related questions but have not found an answer.

I want to create a DataFrame from a case class in Spark 2.3, Scala 2.11.8.

Code

package org.XXX

import org.apache.spark.sql.SparkSession

// The case class must be defined at the top level (not inside main),
// otherwise Spark cannot derive an encoder for it.
case class Employee(Name: String, Age: Int, Designation: String, Salary: Int, ZipCode: Int)

object Test {

  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder
      .appName("test")
      .getOrCreate()

    // Required for the .toDF conversion on local collections.
    import spark.implicits._

    val EmployeesData = Seq(Employee("Anto", 21, "Software Engineer", 2000, 56798))
    val Employee_DataFrame = EmployeesData.toDF

    spark.stop()
  }
}

Here is what I tried in the Spark shell:

case class Employee(Name:String, Age:Int, Designation:String, Salary:Int, ZipCode:Int)
val EmployeesData = Seq( Employee("Anto", 21, "Software Engineer", 2000, 56798))
val Employee_DataFrame = EmployeesData.toDF

Error

java.lang.VerifyError: class org.apache.spark.sql.hive.HiveExternalCatalog overrides final method alterDatabase.(Lorg/apache/spark/sql/catalyst/catalog/CatalogDatabase;)V
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog$lzycompute(HiveSessionStateBuilder.scala:53)
at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog(HiveSessionStateBuilder.scala:52)
at org.apache.spark.sql.hive.HiveSessionStateBuilder$$anon$1.<init>(HiveSessionStateBuilder.scala:69)
at org.apache.spark.sql.hive.HiveSessionStateBuilder.analyzer(HiveSessionStateBuilder.scala:69)
at org.apache.spark.sql.internal.BaseSessionStateBuilder$$anonfun$build$2.apply(BaseSessionStateBuilder.scala:293)
at org.apache.spark.sql.internal.BaseSessionStateBuilder$$anonfun$build$2.apply(BaseSessionStateBuilder.scala:293)
at org.apache.spark.sql.internal.SessionState.analyzer$lzycompute(SessionState.scala:79)
at org.apache.spark.sql.internal.SessionState.analyzer(SessionState.scala:79)
at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:57)
at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:55)
at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:47)
at org.apache.spark.sql.Dataset.<init>(Dataset.scala:172)
at org.apache.spark.sql.Dataset.<init>(Dataset.scala:178)
at org.apache.spark.sql.Dataset$.apply(Dataset.scala:65)
at org.apache.spark.sql.SparkSession.createDataset(SparkSession.scala:470)
at org.apache.spark.sql.SQLContext.createDataset(SQLContext.scala:377)
at org.apache.spark.sql.SQLImplicits.localSeqToDatasetHolder(SQLImplicits.scala:228)

Best Answer

There is nothing wrong with the snippet you copied from the shared link; as the error shows, the problem is something else (the result of running the exact same code is in my run below).
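One frequent cause of this particular java.lang.VerifyError ("overrides final method" in HiveExternalCatalog) is mixing Spark jars from different versions on the classpath, for example spark-sql and spark-hive built against different releases; that is only a guess about your environment, not something the snippet itself can show. A minimal build.sbt sketch, assuming an sbt project, that keeps all Spark modules on one version:

// build.sbt sketch (assumption: sbt project; match the version to your cluster)
val sparkVersion = "2.3.0"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-sql"  % sparkVersion % "provided",
  "org.apache.spark" %% "spark-hive" % sparkVersion % "provided"
)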

case class Employee(Name:String, Age:Int, Designation:String, Salary:Int, ZipCode:Int)
val EmployeesData = Seq( Employee("Anto", 21, "Software Engineer", 2000, 56798))
val Employee_DataFrame = EmployeesData.toDF
Employee_DataFrame.show()

Employee_DataFrame: org.apache.spark.sql.DataFrame = [Name: string, Age: int ... 3 more fields]

+----+---+-----------------+------+-------+
|Name|Age| Designation|Salary|ZipCode|
+----+---+-----------------+------+-------+
|Anto| 21|Software Engineer| 2000| 56798|
+----+---+-----------------+------+-------+
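If toDF gives trouble in a compiled application (for example when spark.implicits._ is not in scope), a small alternative sketch, assuming the same Employee case class is defined at the top level and spark is an active SparkSession, builds the DataFrame explicitly:

// Sketch only: create the DataFrame without the implicit toDF conversion.
val Employee_DF = spark.createDataFrame(Seq(Employee("Anto", 21, "Software Engineer", 2000, 56798)))
Employee_DF.show()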

Regarding scala - Creating a DataFrame from a case class, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/50318610/
