
scala - Spark testing on a local machine

Reposted. Author: 行者123. Updated: 2023-12-01 10:33:33

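I am running unit tests on Spark 1.3.1 with sbt test. Besides the unit tests being incredibly slow, I keep running into java.lang.ClassNotFoundException: org.apache.spark.storage.RDDBlockId. Usually this points to a dependency problem, but I cannot tell where it comes from. I tried installing everything on a new machine, including a fresh hadoop and a fresh ivy2, but I still run into the same issue.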

Any help is greatly appreciated.

Exception:

Exception in thread "Driver Heartbeater" java.lang.ClassNotFoundException: org.apache.spark.storage.RDDBlockId
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)

My build.sbt:

libraryDependencies ++=  Seq( 
"org.scalaz" %% "scalaz-core" % "7.1.2" excludeAll ExclusionRule(organization = "org.slf4j"),
"com.typesafe.play" %% "play-json" % "2.3.4" excludeAll ExclusionRule(organization = "org.slf4j"),
"org.apache.spark" %% "spark-core" % "1.3.1" % "provided" withSources() excludeAll (ExclusionRule(organization = "org.slf4j"), ExclusionRule("org.spark-project.akka", "akka-actor_2.10")),
"org.apache.spark" %% "spark-graphx" % "1.3.1" % "provided" withSources() excludeAll (ExclusionRule(organization = "org.slf4j"), ExclusionRule("org.spark-project.akka", "akka-actor_2.10")),
"org.apache.cassandra" % "cassandra-all" % "2.1.6",
"org.apache.cassandra" % "cassandra-thrift" % "2.1.6",
"com.typesafe.akka" %% "akka-actor" % "2.3.11",
"com.datastax.cassandra" % "cassandra-driver-core" % "2.1.6" withSources() withJavadoc() excludeAll (ExclusionRule(organization = "org.slf4j"),ExclusionRule(organization = "org.apache.spark"),ExclusionRule(organization = "com.twitter",name = "parquet-hadoop-bundle")),
"com.github.nscala-time" %% "nscala-time" % "1.2.0" excludeAll ExclusionRule(organization = "org.slf4j") withSources(),
"com.datastax.spark" %% "spark-cassandra-connector-embedded" % "1.3.0-M2" excludeAll (ExclusionRule(organization = "org.slf4j"),ExclusionRule(organization = "org.apache.spark"),ExclusionRule(organization = "com.twitter",name = "parquet-hadoop-bundle")),
"com.datastax.spark" %% "spark-cassandra-connector" % "1.3.0-M2" excludeAll (ExclusionRule(organization = "org.slf4j"),ExclusionRule(organization = "org.apache.spark"),ExclusionRule(organization = "com.twitter",name = "parquet-hadoop-bundle")),
"org.slf4j" % "slf4j-api" % "1.6.1",
"com.twitter" % "jsr166e" % "1.1.0",
"org.slf4j" % "slf4j-nop" % "1.6.1" % "test",
"org.scalatest" %% "scalatest" % "2.2.1" % "test" excludeAll ExclusionRule(organization = "org.slf4j")
)

And my Spark test settings (all of which I have disabled to test this):

(spark.kryo.registrator,com.my.spark.MyRegistrator) 
(spark.eventLog.dir,)
(spark.driver.memory,16G)
(spark.kryoserializer.buffer.mb,512)
(spark.akka.frameSize,5)
(spark.shuffle.spill,false)
(spark.default.parallelism,8)
(spark.shuffle.consolidateFiles,false)
(spark.serializer,org.apache.spark.serializer.KryoSerializer)
(spark.shuffle.spill.compress,false)
(spark.driver.host,10.10.68.66)
(spark.akka.timeout,300)
(spark.driver.port,55328)
(spark.eventLog.enabled,false)
(spark.cassandra.connection.host,127.0.0.1)
(spark.cassandra.connection.ssl.enabled,false)
(spark.master,local[8])
(spark.cassandra.connection.ssl.trustStore.password,password)
(spark.fileserver.uri,http://10.10.68.66:55329)
(spark.cassandra.auth.username,username)
(spark.local.dir,/tmp/spark)
(spark.app.id,local-1436229075894)
(spark.storage.blockManagerHeartBeatMs,300000)
(spark.executor.id,<driver>)
(spark.storage.memoryFraction,0.5)
(spark.app.name,Count all entries 217885402)
(spark.shuffle.compress,false)
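
A listing in this (key,value) form is what printing the entries of a SparkConf produces. The following is a minimal sketch of how such a dump can be generated; the object name DumpSparkConf and the app name "conf-dump" are hypothetical, and only two of the settings listed above are reproduced.

```scala
import org.apache.spark.SparkConf

// Hypothetical helper for illustration; not part of the original question.
object DumpSparkConf {
  // loadDefaults = false: ignore spark.* system properties so only explicit settings appear
  def sampleConf: SparkConf =
    new SparkConf(false)
      .setMaster("local[8]")
      .setAppName("conf-dump") // hypothetical app name
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")

  def main(args: Array[String]): Unit =
    // getAll returns Array[(String, String)]; a Scala tuple prints as (key,value)
    sampleConf.getAll.sortBy(_._1).foreach(println)
}
```

Inside a running application, the same call on sc.getConf also shows values that Spark fills in itself, such as spark.app.id and spark.driver.port.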

Assembled or packaged jars sent to standalone or Mesos work fine! Any suggestions?

Best answer

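We ran into the same issue in Spark 1.6.0 (there is already a bug report for it). We fixed it by switching to the Kryo serializer, which you should be using anyway. So it appears to be a bug in the default JavaSerializer.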

Simply do the following to get rid of it:

new SparkConf().setAppName("Simple Application").set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
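
For a unit test, the same setting can be applied to a local context. The following is a minimal sketch, not the answerer's actual code: the object name KryoLocalTest, the local[2] master, and the sample job are assumptions made for illustration. Only the spark.serializer setting comes from the answer.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical helper for illustration; not part of the original answer.
object KryoLocalTest {
  // Builds a local-mode configuration that uses Kryo instead of the default JavaSerializer.
  def kryoConf(appName: String): SparkConf =
    new SparkConf()
      .setMaster("local[2]") // assumption: two local threads are enough for a test
      .setAppName(appName)
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")

  // Runs a tiny job and always stops the context so later tests can create their own.
  def sumOfDoubles(n: Int): Int = {
    val sc = new SparkContext(kryoConf("Simple Application"))
    try sc.parallelize(1 to n).map(_ * 2).reduce(_ + _)
    finally sc.stop()
  }

  def main(args: Array[String]): Unit =
    println(sumOfDoubles(10))
}
```

Stopping the context in a finally block matters in test suites, because a context left running by one test can interfere with the next test that tries to create its own.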

Regarding "scala - Spark testing on a local machine", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/31280355/
